

INTERDISCIPLINARITY


Why scientists must 
work together to save 
the world PAGE 305


SELF-DEFENCE COURSE
New technologies can make sense of human immunity

Mixed legacy of Victorian explorer Richard Burton


THIS WEEK 


OVERSIGHT More institutions 
must check on animal 
experiments p.290 


EDITORIALS 


WORLD VIEW My unhappy
time as an integrated
social scientist p.291

GONE FISSION Nazi
uranium never saw the
inside of a reactor p.292




Too close for comfort? 


Relationships between industry and researchers can be hard to define, but universities and other 
institutions must do more to scrutinize the work of their scientists for conflicts of interest. 


What sort of industry connections could buy influence over a
scientist’s research results? Research grants as small as
US$5,000? Money to support outreach that bolsters the
industry's image? Equity in a spin-off company founded by the scientist? 
Defining what constitutes a conflict of interest — much less regulating 
it — continues to vex funding agencies, journals and institutions. Last 
month, for instance, Nature revealed that an activist organization had 
filed freedom-of-information requests to see the e-mails of research- 
ers who work on genetically modified crops (see Nature 524, 145-146; 
2015). Among other findings, their haul revealed that plant scientist 
Kevin Folta at the University of Florida in Gainesville had accepted a 
no-strings-attached $25,000 grant from the agriculture giant Monsanto. 

In his defence, Folta argued that the money supported only travel 
and outreach, not research, and he was therefore under no obligation 
to disclose it. This seems to be consistent with his institution’s guide- 
lines, and there is no evidence of any wrongdoing or that his research 
was compromised. 

Solar physicist Willie Soon, a climate-change sceptic at the Harvard- 
Smithsonian Center for Astrophysics in Massachusetts, also seems to 
have been operating within institutional policy when advocacy groups 
revealed in February that he had accepted more than $1 million from 
the energy industry, among other funders. (However, his failure to 
disclose those relationships might have violated the policies of some 
journals in which he published; see Nature http://doi.org/2jx (2015).) 

In trying to navigate such complexities, the US National Institutes 
of Health (NIH) has been ahead of the curve — presumably because of 
long-standing concerns about physicians’ industry relationships and the 
high stakes for protecting patients. Its parent agency, the Department of 
Health and Human Services (HHS), was the first to establish conflict-of-interest
disclosure rules in 1995 and is still beyond many of its counterparts
in maintaining unified regulations that include yearly reports to
the government. By contrast, as one example, the US National Science 
Foundation’s grants policy suggests that institutions look to scientific 
societies for ideas on how to manage a conflict of interest, and to report
back to the foundation only if institutions cannot handle it themselves. 

But even the HHS rules were not enough to guarantee full transpar- 
ency. In 2009, a congressional report and subsequent media coverage 
found that some NIH-funded researchers had quietly accepted millions 
of dollars from industry. Again, the blame kept shifting: the universities 
said that the researchers had not reported the conflicts, the NIH received 
only bare-bones reports from institutions, and the researchers said that 
they did not know they were breaking any rules. 

The HHS updated its policies in 2011, but pleased no one. The 
government underestimated the time and money that institutions would 
spend implementing new rules. And some aspects of the reforms have 
proved to be window dressing: a Nature investigation this week reveals 
that these reforms have uncovered few conflicts of interest that would 
have escaped the original regulations (see page 300). 


The reforms may not be perfect, but they address real issues and others 
should take note. They make it clear that institutions are accountable, 
that they must educate their researchers on financial disclosure and that 
they should evaluate whether an industry relationship is problematic. 
The reforms also enlist a second pair of eyes by requiring institutions to 
report details of the conflict and its management to the NIH. Perhaps 
most importantly, the reforms remove the excuse of plausible deniability
by clearly stating the kinds of financial relationship that could be
considered conflicts.

“The reforms may not be perfect, but others should take note.”

One thing has become clear: conflicts are slippery to define, so it is
important for as many funders, institutions and journals to make as many
demands as necessary. Had
Kevin Folta been funded by the NIH, the HHS 
guidelines would have required him to report the Monsanto money. 
And if Willie Soon had had an NIH grant, his institution would have 
designed a ‘management plan’ that could have required his industry
relationships to be stated in publications and lectures. 

The HHS rules could backfire. Institutions do not want the publicity 
and work that accompany an identified conflict. Because they hold the 
power to decide whether a relationship presents a conflict, they could 
theoretically give their researchers a pass. Nature’s investigation suggests 
that institutions use vastly different standards to evaluate such relation- 
ships, meaning that the rule is unevenly applied. And the current system 
makes it difficult for the public to access the conflict reports. 

Still, the HHS should be commended for at least attempting to 
address the problem, even if it was forced into doing so. Other funders 
and institutions could do worse than to learn from its successes and 
mistakes if they define and strengthen their own policies.




Mind meld 


Interdisciplinary science must break down 
barriers between fields to build common ground. 


In Castlegar, Canada, there is a golf shop that also offers vacuum-cleaner
repairs, and in the Czech Republic town of Kostelec nad Orlicí, a business
will sell you both wine and underwear. Such odd couplings are humorous
because of their curiously limited scope. There is nothing funny, after all,
about a store that repairs equipment and sells golf clubs, wine, underwear
and everything else under the Sun.

INTERDISCIPLINARITY
A Nature special issue
nature.com/inter


17 SEPTEMBER 2015 | VOL 525 | NATURE | 289


© 2015 Macmillan Publishers Limited. All rights reserved

The binary combinations also lead us to assume something about 
the shop’s owners. Faced with a specific set of circumstances, these 
businesses redefine what we expect from a shop and offer something 
distinct. 

There are greater problems in the world than what to do with your 
vacuum cleaner while you decide what make of balls to buy, but the 
principle is worth remembering as you browse this week’s special issue 
of Nature, which we dedicate to interdisciplinary science. 

Most scientists are aware of the term, and many will have used it. But 
how many are truly engaged in it? Done correctly, it is not mere multi- 
disciplinary work — a collection of people tackling a problem using their 
specific skills — but a synthesis of different approaches into something 
unique. It is the wine and underwear shop, not the hypermarket. 

The best interdisciplinary science comes from the realization that 
there are pressing questions or problems that cannot be adequately 
addressed by people from just one discipline. Witness the gathering 
of the scientific tribes — and the merging of approaches — for the 
Manhattan Project to work on the atomic bomb. More recently, Nature 
has reported on ‘implementation science’, which combines medical
expertise with local knowledge on how best to carry out programmes 
to improve public health (see Nature 523, 516-518; 2015). 

An interdisciplinary approach should drive people to ask questions 
and solve problems that have never come up before. But it can also 
address old problems, especially those that have proved unwilling to 
yield to conventional approaches. 

Enough of the rhetoric, what about the reality? It is hard to deny that 
the scientific system — from funding streams and academic rewards 
to university departments and journals — does not encourage much 


overlap between disparate subjects. It is easy to set up a ‘Centre for
Interdisciplinary Research’, but who will be prepared to join it? If
governments, funders and universities want to encourage more basic
researchers to leave their trenches, then they need to make the
no-man’s-land of interdisciplinarity a more welcoming place to build a
career. The obstacles are many, as we discuss in the pages that follow. 
Some groups have found ways to overcome these obstacles, and 
some high-quality interdisciplinary work is under way. What are the key
lessons from these successes?

“True interdisciplinary science cannot be rushed.”

Interdisciplinary science takes longer than conventional projects, and that
makes it more expensive. Funders must accept and embrace
this and hold their nerve if the pay-off from individual projects takes 
longer than expected. 

True interdisciplinary science cannot be rushed, not least because 
the best course of investigation is rarely clear at the outset. Research 
questions must be assessed and decided with input from all involved. 
An interdisciplinary project cannot exist as one main subject that 
sucks in the majority of the resources and leaves the partners as orbit- 
ing satellites. 

Communication is crucial. The varying use of language across disci- 
plines might seem a superficial problem, but it is one that must be solved, 
or misunderstandings will undermine the foundations of the project. 
There must also be no hierarchy, or perceived hierarchy. All involved 
must be confident that colleagues from other disciplines use equal aca- 
demic rigour and scientific standing, even if the methods used in rival 
fields seem alien. It takes time to see the value in other approaches. It 
takes an open mind to appreciate an appliance-mending golf shop.


Protection priority 


All involved in animal research must ensure
that rules for ethical experiments are observed. 


More than a million people in Europe signed a petition earlier
this year to halt research with animals. One reason why
Nature and many scientists are able to defend these experiments
is that all involved do everything they can to minimize pain and
suffering. Animal experiments are approved only after thorough discus- 
sion and are carried out according to strict regulatory controls. Society 
sees the benefits of animal research, but it does not seek them at any cost. 

When breaches of the strict rules that govern animal research occur, 
it is vital — to both supporters and opponents — that they are inves- 
tigated thoroughly, and that lessons are learnt and shared. This week, 
Nature publishes a correction on its website that details such a breach 
of experimental protocol in a previously published paper (L. Raj et al. 
Nature http://dx.doi.org/10.1038/nature15370; 2015).

The relevant experiments grew tumours in mice as a way to test 
possible treatments. This type of study is common, as is the way 
they are approved and regulated. Researchers typically plan the 
experiments and then submit details to an institutional review board 
for approval. In making its decision, the board follows guidelines 
set out by a separate body charged with oversight of animal pro- 
cedures — an institutional animal care and use committee. These 
guidelines are country-specific, and in the case of tumour experi- 
ments should include limits on the maximum tumour size allowed, 
and instructions to the researchers to monitor both tumour size and 
signs of distress. 

In this case, prompted by a complaint from a reader and follow- 
ing consultation with the authors and the relevant bodies, Nature 
has established that the scientists did not carry out the required 


monitoring properly. As a result, some of the tumours grew larger 
than permitted. These mice could therefore have experienced more 
pain and suffering than originally allowed for. 

As well as writing to correct their paper to mark the breach of animal- 
welfare guidelines, the authors apologize for the breach. They are right to 
do so. Cases such as this could provoke a justifiable backlash against
animal research. All involved — scientists, institutions, funders and
journals — must do more to ensure that regulations are strictly observed.

Nature’s policy is that the corresponding author on a paper that 
reports experiments with animals must confirm that the research was 
carried out in accordance with the relevant rules (see
go.nature.com/a9pjym). As a result of this case, we are increasing the amount of
information we request from authors. In experiments in which tumours are
grown, we now require authors to include the maximal tumour size 
permitted by the institutional animal-use committee, and to state that 
this was not exceeded. Authors must also provide the source data for 
any figures that analyse tumour growth. 

Nature does not want to publish the results of experiments that have 
not been performed under ethical guidelines. As such, the authors in 
this case are correcting their paper to withdraw the portion of the data 
collected in experiments that the institutional committee concluded 
were in breach. The scientific conclusions of the paper remain valid and 
useful, and still stand. 

Institutions should do more to make sure that the guidelines they set 
are respected. At the very least, on completion of each project — and 
before data are submitted — institutions should verify that approved 
protocols were followed. Funders and institutions must consider 
better training for young researchers doing work with animals. And 
the broader community should continue to scrutinize and improve 
how it carries out these types of experiment. Discussions are already 
under way, for example, on whether the con- 
trol arms of similar cancer studies truly need 
to let (untreated) tumours grow as large as they 
currently do. Nature is happy to join these 
discussions and to help to improve practice.


NATURE.COM

To comment online, 
click on Editorials at: 
go.nature.com/xhunqv 


290 | NATURE | VOL 525 | 17 SEPTEMBER 2015 | CLARIFIED ONLINE 16 SEPTEMBER 2015 


WORLD VIEW


Integration of social science
into research is crucial


Social scientists must be allowed a full, collaborative role if researchers are to
understand and engage with issues that concern the public, says Ana Viseu.


NATURE.COM
Discuss this article online at:
go.nature.com/3u8ge9


Funders and institutions increasingly prioritize research that addresses the
challenges and opportunities of an inherently interdisciplinary world.
Policymakers and influential voices in science — including Nature — have also
warned of a worrying disconnect between research and the needs and concerns of
the public. One proposed solution is the integration of social scientists such
as myself into publicly funded research initiatives. This is expected to
contribute to the production of ‘better’ science.

Not in my experience. I spent three years as an in-house social
scientist at the Cornell NanoScale Science and Technology Facility in
Ithaca, New York, and the US National Nanotechnology Infrastructure
Network, and it was a futile and frustrating time. I left a decade ago,
but friends and colleagues who have since worked on similar projects
tell me that the problem is widespread and that little has changed. Too many
in the physical and life sciences dismiss social sciences as having a
‘service role’, being allowed to observe what they do but not disturb it.

In its current model, integration is fuelled by the assumption that projects
bring in the social sciences to carve a place for ‘society’. This is expected
to maximize the benefits of research while reducing negative impacts and
public controversy. In other words, rather than being scientists in our own
right, we are brought along as silent partners whose job it is to care for
science. Rather than blurring boundaries and labour divisions, integration
works to reify them. Thus, the questions that social scientists ask and the
expertise we can contribute are muted or made invisible because we remain
outside ‘proper’ science.

Integration is also deeply asymmetrical. The social sciences (often a
single social scientist) are typically brought in after the project has taken
shape. This asymmetry is present in every aspect of integration — from
power to personnel numbers, funding, knowledge production and,
ultimately, independence — but remains hidden in mundane interactions
that dictate what counts as a valid social-science activity and
who gets to define it.

This is not genuine integration. It pays lip service to the idea and is a
waste of everyone’s time and the public money that supports it.

FOR THE SOCIAL SCIENCES TO MAKE MEANINGFUL CONTRIBUTIONS,
FUNDING STRUCTURES MUST BE RETHOUGHT.

When I began my work alongside the nanotechnology scientists, I
naively expected that my expertise as an ethnographer would be useful.
I was prepared to study the culture of a laboratory and to probe its
interaction with wider society. I thought that this would be helpful, given the
frequent statements made by nanotechnology experts about how they wanted to
engage and talk about the risks and benefits of their work. Instead, the other
scientists seemed to view my role as one of managing a narrow list of possible
risks and consequences, so that if a researcher followed my instructions
and ticked boxes, then I would bless them as ‘social and ethical’ and 
they would be free to do their work with no concerns. I was routinely 
(wrongly) introduced as an ethicist and was expected to find minimal, 
non-disruptive ways of dealing with social and ethical issues. This was 
not a job that I could do nor wanted to do. Worse, my attempts to 
build bridges with my technical colleagues, for example by donning a 
cleanroom suit and learning how to use some of the equipment, were 
classified in lab annual reports as “outreach”. My perceived contribution 
was not one of expertise, but rather of a willingness to be educated in 
the proper way of thinking about nanotechnology. 

Although my experience has left me sceptical of integration, I am not
ready to dismiss the idea of fruitful collaboration between the natural 
and social sciences. Some fixes could be easily 
implemented: initiatives aiming for integration 
should have teams of social scientists, instead of 
one or two individuals, and these teams should be 
given the financial and operational autonomy to 
define and implement their activities. 

When integration is planned, there should be 
a reassessment of what social scientists call the 
‘positionality’ of the projects, which determines 
who pays for the research and thus who has the 
power to decide what is done, how it is done and 
what can be said about it. 

For the social sciences to make meaningful 
contributions, funding structures must also be 
rethought. Ideally, we would see increases in 
stand-alone funding for social-science strands 
without requirements for integration or subordi- 
nation to a topic. But this seems unlikely. There- 
fore, we must push for project funding structures that — from the 
start — allocate and ring-fence money for the social-science component. 

But this is not enough. For ‘integration’ to be productive, we must
change its very meaning, from one of service to collaboration between 
equals. Doing so involves changes to scientific education and practice as 
well as continued reframing of our definitions of success. We must insist
on the value of complexity, so that divergent thinking is not eclipsed in 
the effort to speak with one voice. We must make room for the disputes 
that are at the centre of knowledge production. 

This is all the more important because, in a world of decreased 
funding for social sciences and humanities, speaking out of tune is 
both difficult and crucial. So we must begin to think of new means of 
partnership that will benefit us all.


Ana Viseu is associate professor at the Universidade Europeia in Lisbon, 
and a member of the Centro Interuniversitário de História das Ciências e
Tecnologia, Universidade de Lisboa de Ciências, University of Lisbon.
e-mail: ana@anaviseu.org 




RESEARCH HIGHLIGHTS
Selections from the scientific literature


Climate sceptics
use strong words


Climate scientists use more cautious language in scientific reports than do
climate-change sceptics, even though the sceptics often accuse the scientists
of being alarmist.

Srdan Medimorec and Gordon Pennycook at the University of Waterloo in Canada
used software to analyse the style of language in a report by the
Intergovernmental Panel on Climate Change (IPCC) in 2013 and in a response
written by a sceptic group, the Nongovernmental International Panel on
Climate Change (NIPCC). The researchers did not assess the scientific accuracy
of the reports but found that the NIPCC report used emotional language and the
IPCC report contained more neutral and formal phrasing.

The authors hypothesize that the IPCC uses such language because of scrutiny
from the media and sceptics.
Clim. Change http://doi.org/7mb (2015)


NUCLEAR PHYSICS


Forensics reveals
uranium’s past


Uranium from German experiments during the Second World War was not used
in a nuclear reactor for any appreciable amount of time.

Maria Wallenius at the European Commission Joint Research Centre's Institute
for Transuranium Elements in Karlsruhe, Germany, and her colleagues did a
forensic analysis of uranium samples (pictured) used in 1940s experiments in
Germany. They looked for trace elements and isotopes of uranium and plutonium
that are created when neutrons released during nuclear fission smash into
other atoms.

They traced the origin of the uranium to a mine in the Czech Republic, and
found that isotope ratios matched those found in natural uranium ore. The
samples were never used in experiments that reached the critical mass
necessary for sustained nuclear fission.
Angew. Chem. Int. Ed. http://doi.org/f3f7js (2015)


ANIMAL BEHAVIOUR


Whales that click create cliques


Sperm whales form clans by learning vocal calls from others that sing like
them. This kind of ‘cultural transmission’ has been seen as a mainly
human trait.

Sperm-whale clans use distinct dialects of clicks to communicate. To learn how
their complex societies form, Mauricio Cantor at Dalhousie University in
Halifax, Canada, and his colleagues used 18 years of data on the acoustic
calls of sperm whales (Physeter macrocephalus; pictured) from around the
Galápagos Islands to build several possible models of whale populations. In
their simulations, the clans that have been observed in nature did not form
when the vocal calls were genetically inherited or learned from other
sperm whales in general. But clans did form when the animals adopted the most
common calls produced by certain individuals — mainly those with similar
communication patterns. This further suggests that humans are not the only
mammals that segregate according to similarities in learned behaviour.
Nature Commun. 6, 8091 (2015)


CANCER

A trap for roving
cancer cells


Implanting a polymer scaffold 
in mice that have tumours 
captures spreading cancer cells, 


enabling their early detection. 
Lonnie Shea at the University 
of Michigan in Ann Arbor and 
his colleagues placed human 
breast-cancer cells in mice and 
implanted the scaffolds in their 
abdomens a week later. Two 
weeks after cell transplantation, 
the researchers detected cancer 
cells in the scaffolds but not 
in the lungs or liver, where 
breast cancer often spreads. 
After 28 days, mice with 
scaffolds had fewer tumours 
in their lungs than did animals 
without scaffolds. And using 
an imaging technique, the team 
measured changes in the tissue 




properties within the scaffold 
that indicated the presence of 
cancer cells. 

An inflammatory response 
to the scaffold attracted the 
cancer cells. This approach 
could eventually be used 
in humans to detect the 
early spread of cancer, the 
authors say. 

Nature Commun. 6, 8094 (2015) 


PLANETARY SCIENCE 


A faster spin for 
Mercury 


Mercury rotates nine seconds 
faster than scientists had 
thought, probably because 

of gravitational effects from 
Jupiter. 

A team led by Alexander 
Stark of the German Aerospace 
Center in Berlin studied three 
years of data from NASA's 
MESSENGER spacecraft, 
which orbited the planet 
between 2011 and 2015 and 
measured Mercury’s rotations 
more precisely than ever before. 

The data also confirm that 
the planet has a molten outer 
core, causing this part to rotate 
at a different speed from the 
solid inner layers. 

Geophys. Res. Lett. http://doi. 
org/7mc (2015) 


Muscle wasting 
blocked in mice 


Giving tumour-bearing mice 
specific proteins prevents a 
muscle-wasting syndrome that 
commonly affects people with 
cancer. 

Many patients with cancer 
die from severe muscle loss 
(cachexia), which has no 
treatment. To find a way to 
halt the condition, Amelia 
Johnston and Nicholas 
Hoogenraad at La Trobe 
University in Melbourne, 
Australia, and their colleagues 
injected mice with mouse 
cancer cells that had been 
engineered to express a 
human gene encoding the 
protein Fn14, which drives 
cancer growth. The animals 
lost muscle and fat, but giving 
the mice an antibody against 


Fn14 stopped cachexia. 
Moreover, in a mouse model 
of cachexia, the animals lived 
longer and maintained body 
weight when treated with an 
anti-Fn14 antibody, compared 
with untreated mice. 
Targeting Fn14 proteins that
are generated by tumours could
be a treatment strategy for this
condition, the authors say. 
Cell 162, 1365-1378 (2015) 


ASTRONOMY 


The farthest 
galaxy so far 


Astronomers have observed 
the most distant galaxy yet 
by detecting photons emitted 
from its clouds of hydrogen 
when the 13.8-billion-year- 
old Universe was less than 
600 million years old. 

Such photons rarely make 
it to telescopes on Earth, but 
Adi Zitrin at the California 
Institute of Technology in 
Pasadena and his colleagues 
were able to detect them using 
a telescope at the W. M. Keck 
Observatory on Mauna Kea,
Hawaii. They found that 
the wavelength of arriving 
photons had been stretched 
en route, indicating that the 
galaxy, named EGSY8p7, is 
more than 13.2 billion light 
years (4 billion parsecs) away. 

Seeing hydrogen emission 
from such a distant galaxy 
may challenge current 
understanding of the 
evolution of the Universe, the 
authors say. 

Astrophys. J. Lett. 810, L12 (2015) 


ECOLOGY 


Marauding ants 
bring disease 


One of the most widespread invasive ant species not only displaces native
ants, but also carries viruses.

Phil Lester at Victoria University of Wellington and his colleagues searched
for viral sequences in RNA extracted from Argentine ants (Linepithema humile;
pictured) in New Zealand. They found a virus that they named Linepithema
humile virus 1, which could explain periodic crashes in Argentine ant
populations. They also found that the ants carried deformed wing virus, which
can be fatal to honeybees.

The team suggests that bees could become infected when the ants forage or
raid bee nests.
Biol. Lett. 11, 20150610 (2015)


Weyl particles
discovered


Three separate teams have found analogues of Weyl fermions: massless
elementary particles that were first predicted in 1929 but have never been
observed.

Physicists searching for these fermions look for their unusual properties in
the collective behaviour of other particles. Hong Ding and Tian Qian at the
Chinese Academy of Sciences in Beijing and their colleagues saw these
‘quasiparticles’ by probing a sample of tantalum arsenide with a beam of
X-rays. In July, a separate group of researchers led by Zahid Hasan at
Princeton University in New Jersey announced that they had seen the particles
in the same material. Ling Lu at the Massachusetts Institute of Technology in
Cambridge and his colleagues reported seeing signs of the particles in the
behaviour of light passing through a crystal.

Such experimental systems could allow researchers to probe the exotic
properties associated with Weyl particles.
Phys. Rev. X 5, 031013 (2015); Science 349, 613-617; 622-624 (2015)


SOCIAL SELECTION
Popular topics on social media


Science failings shared on Twitter


Researchers’ best success stories end up in journals, but many of their
less-successful ones found their way on to Twitter this week with the hashtag
#FailingInSTEM. Tales of low points and often-humorous mishaps reassured
others that failures can be overcome on the way to scientific success. “The
#FailingInSTEM tweets are so important! It's so comforting to know that other
scientists make mistakes,” tweeted Aimee Eckert, a PhD student in cell
biology at the University of Sussex in Brighton, UK. Nicole Cabrera Salazar,
an astronomy PhD student at Georgia State University in Atlanta, started the
#FailingInSTEM Twitter discussion after a friend of hers suffered a scientific
setback: “We need to let our young ppl know that regular, fallible people do
science. We make mistakes everyday. It’s part of the job #FailingInSTEM.” She
suspected that other young researchers could use a reminder that science is
not all about successful experiments and flashy publications. “People don't
talk about all of the times that they broke something in a lab or got heckled
during a presentation,” she says.

NATURE.COM
For more on popular papers:
go.nature.com/mzblhl


NATURE.COM

For the latest research published by 
Nature visit: 
www.nature.com/latestresearch 




SEVEN DAYS


Clone-products ban 


The European Parliament has 
voted in favour of a sweeping
ban in the European Union 

on the use of food and feed 
products — domestic or 
imported — from cloned 
animals and their descendants. 
The rules tighten a draft 

law proposed in 2013 by 

the European Commission 

to prohibit the sale of food 
products derived from cloned 
animals. Parliamentarians 
said that the proposed 
amendments, voted in on 

8 September, reflect widespread 
consumer concerns over 
animal welfare and food safety. 
But Vytenis Andriukaitis, 

the European commissioner 
for health and food safety, 
called the amendments 
“disproportionate” and warned 
that they might prove “legally 
impossible”. 


Energy review 

The US Department of Energy released its second Quadrennial Technology
Review on 10 September, which looks at current energy technologies and
identifies opportunities for research and development. The report suggests
that the US energy system is becoming increasingly diverse with the rise of
renewable power, as well as being more interconnected through Internet and
communications technologies; both trends open the door to cleaner energy and
fewer emissions. Although the United States has made progress on energy
efficiency, the report notes that substantial opportunities remain for
reducing energy consumption and costs.


NUMBER CRUNCH


879


Total number of days Russian cosmonaut Gennady Padalka has spent in space,
the most by any individual. Padalka returned from his latest stay on the
International Space Station on 11 September.


Pluto's ‘heart’ snapped in high resolution


The left lobe of Pluto's bright heart-shaped feature, Tombaugh Regio, is
clearly seen in the upper right of this image released on 10 September. The
view spans 1,800 kilometres and is generated from high-resolution images


Sonar muffled 

The US Navy has agreed to 
limit its sonar and explosives 
activities in areas that might 
harm dolphins, whales and 
other marine mammals. The 
agreement between the navy 
and environmental groups, 
including Earthjustice and the 
Natural Resources Defense 
Council, was ordered by a 
federal judge on 14 September. 
The navy will not be able to 


294 | NATURE | VOL 525 | 17 SEPTEMBER 2015 
© 2015 Macmillan Publishers Limited. All rights reserved 


use sonar, which can disrupt 
communication between 
marine mammals, in areas off 
the Southern California coast. 
Areas around Hawaii's islands 
are protected from sonar and 
explosives training operations. 


Energy ambitions 
The California legislative 
assembly passed laws on 

11 September that will increase 
requirements for renewable- 
energy production in the state. 
The bill, sought by Governor 
Jerry Brown, raises the current 
renewable-energy quota of 33% 
by 2020 to a more-rigorous 
50% by 2030. It also sets a goal 
of doubling energy efficiency in 
the electricity and natural-gas 
sectors by 2030. An earlier draft 
of the bill would have required 
the state to halve its oil use 


gathered during the 14 July fly-by of Pluto 

by NASA's New Horizons spacecraft. The 
boundary between the bright, icy plains (called 
Sputnik Planum) and dark, cratered terrains 
(called Cthulhu Regio) is particularly striking. 


over the same period, but the 
provision was dropped because 
of feasibility and cost concerns. 


Land undervalued
Global land degradation
costs between US$6.3 trillion
and $10.6 trillion per year,
according to the Economics
of Land Degradation (ELD)
Initiative in Bonn, Germany.
In a report published on
15 September, the ELD
said that unsustainable
land management ruins
productivity and removes
ecosystem services that have
no market value, including
nutrient recycling and
disease regulation. The loss
figure includes the costs of
replacing these services, for
example by buying fertilizers
or vaccines. People without
funds to replace services
such as clean water are
particularly vulnerable to land
degradation.


NASA/JHU APL/SRI


STEFAN POSTLES/GETTY


SOURCE: C. HANDKEY ET AL. SOC. SCI. RES. NETW. HTTP://DOI.ORG/7PX (2015).


Permafrost tracked 
The first international
database of standardized
permafrost data was launched
this week by the Global
Terrestrial Network for
Permafrost (GTN-P), an
international consortium
that aims to establish an
early-warning system for
permafrost thawing, for use by
scientists and policymakers.
The European Union-funded
database gathers frozen-soil
temperatures and annual
thaw depths. Permafrost has
a key role in climate-change
modelling, because when
it thaws it can release the
greenhouse gases carbon
dioxide and methane.


EVENTS 


Turnbull coup
Malcolm Turnbull has ousted
his fellow Liberal party
member Tony Abbott as
Australian prime minister,
after forcing a party ballot
on 14 September. Turnbull
(pictured) won the leadership
vote by 54 votes to 44. On
his first day, Turnbull said
that he would keep current
climate-change policies for
now. In 2014, Abbott repealed
Australia’s carbon tax and
scrapped an emissions-
trading scheme, disappointing
Australian climate scientists.
In 2009, Turnbull was
overthrown as party leader
by Abbott, in part because
of Turnbull's support for an
emissions-trading scheme.


TREND WATCH


Researchers in countries that
often give academic exemptions
from copyright laws, such as
the United States, publish more
data-mining studies than do
those in places
such as Germany and France,
where academics must first gain
consent. The work, presented at
the 2 September European Policy
for Intellectual Property meeting
in Glasgow, UK, analysed 18,441
articles on data mining, a method
that allows investigators to trawl
large data sets for discoveries
(C. Handkey et al. Soc. Sci. Res.
Netw. http://doi.org/7px; 2015).


Pesticide repealed 


A US appeals court has
rescinded the approval
by the Environmental
Protection Agency (EPA)
of an insecticide named
sulfoxaflor. The approval
was challenged by a group of
bee-keeping organizations,
which cited evidence that
sulfoxaflor — a neonicotinoid
compound with an unusual
mechanism — is highly toxic
to bees. The court ruled
on 10 September that the
EPA’s decision was based
on ‘flawed and limited data’.
Dow AgroSciences, which
won approval for sulfoxaflor
in 2013, is considering
challenging the ruling.


BUSINESS
Phage trial starts


A European company has
started the first randomized
clinical trial using bacterium-
killing viruses called phages.
On 9 September, Pherecydes
Pharma, based in Romainville,
France, announced that
it had begun enrolling
people with burns who are
susceptible to infection by
the bacteria Escherichia coli
and Pseudomonas aeruginosa.
The company will test two
cocktails of phages that attack
these species in trials involving
220 people in Switzerland,
France and Belgium. Phage
therapy has been used for
decades in Eastern Europe,
but has not yet been tested in a
large, controlled trial.


COPYRIGHT IMPACTS ON DATA MINING


In countries where copyright restrictions are relaxed for researchers
mining large data sets, more data-mining articles are published.

[Chart: proportion of data-mining articles published in 2003-14 for France, the United States, Germany, Taiwan and Singapore. Key: data mining not allowed; data mining probably allowed.]


17-18 SEPTEMBER
Scientists from across
academia and industry
gather in Nairobi to
discuss biotechnology
and biomedical science
in Africa.
http://aibbc.org/


17-21 SEPTEMBER
The 30th International
Papillomavirus
Conference convenes
in Lisbon, offering
workshops in clinical
and public health.
http://www.hpv2015.org/


21-25 SEPTEMBER
Sandbjerg, Denmark,
hosts the 3rd
International Workshop
on Microbial Life
under Extreme Energy
Limitation.
http://microenergy2015.org/


FUNDING
Boost for Africa


Researchers in African
nations will share more than
£46 million (US$70 million)
in a programme to build
science capacity. The first
seven ‘Developing Excellence
in Leadership, Training and
Science’ (DELTAS) awards were
announced on 10 September.
They range from mental-health
research in Zimbabwe to
science-leadership training in
Kenya. Funded by the London-based
biomedical charity the
Wellcome Trust, together
with the UK Department for
International Development
and the Bill & Melinda Gates
Foundation, the DELTAS
programme will from 2016 be
managed by the newly formed
Alliance for Accelerating
Excellence in Science in Africa
(see Nature 520, 142-143;
2015).


New sponsor sought 
Intel will drop its support for 
the oldest and most prestigious 
science contest for US high- 
school students, Science Talent 
Search, after the 2017 event. 
Eight Nobel prizewinners, 

five winners of the National 
Medal of Science and many 
other scientists are alumni of 
the 73-year-old programme, in 
which students are judged on 
original research. The Society 
for Science & the Public in 
Washington DC, which runs 
the competition, announced 
on 9 September that it was 
looking for a new sponsor after 
Intel pulled its US$1.6-million 
annual backing, begun in 1998. 




NEWS IN FOCUS


NIH conflict-of-interest rules get poor reviews p.300

LIGO resumes hunt for gravitational waves p.301

Security cameras capture 86 new meteor showers p.302

The most interdisciplinary fields and countries are revealed p.306


WITS UNIVERSITY 


Lee Berger (front) recruited a team of wiry excavators to retrieve more than 1,500 fossils from the Dinaledi chamber in South Africa. 


PALAEOANTHROPOLOGY


Crowdsourcing digs up an 
early human species 


Palaeoanthropologist asks excavators and anatomists to study Africa’s richest fossil trove. 


BY EWEN CALLAWAY 


“Dear colleagues — I need the help
of the whole community,” palaeoanthropologist
Lee Berger posted
on social media on 6 October 2013.

Berger, based at the University of the
Witwatersrand in Johannesburg, South Africa,
had just learnt of a small underground chamber
loaded with early human fossils. He was
looking for experienced excavators to collect
the delicate remains before they deteriorated
further. “The catch is this,” Berger went on.
“The person must be skinny and preferably
small. They must not be claustrophobic, they
must be fit, they should have some caving
experience.”

Less than two years after he posted this
missive, Berger and his team have pieced
together more than 1,500 ancient human bones
and teeth from the Rising Star cave system —
the biggest cache of such material ever found in
Africa. The remains belong to at least 15 individuals
of a previously undescribed species that
the team has dubbed Homo naledi, and they
may mark the oldest-known deliberate burial in
human history, Berger and his colleagues report
in eLife (L. R. Berger et al. eLife 4, 09560; 2015
and P. H. G. M. Dirks et al. eLife 4, 09561; 2015).
For Berger, the research marks a milestone in
a campaign to transform palaeoanthropology
into an open and inclusive field, in which
rare fossils are rapidly shared with the scientific
world instead of being squirrelled away as an
elite few scrutinize them for years.

“There's lots of fossils out there no one 
has ever seen, except for a few select people. 
Palaeoanthropology is really rotten that way,” 
says Tracy Kivell, a palaeoanthropologist at the 
University of Kent in Canterbury, UK, who 
analysed hand bones from Rising Star and is a 
co-author of the paper that describes H. naledi. 
“Lee is changing that and setting a new standard
for what we should expect.”

Palaeoanthropologist Denné Reed of the 
University of Texas at Austin sees Berger’s 
openness as part of a generational shift in the
field. “We're more interested in openly sharing 
data,” he says. “The advantages in collabora- 
tion far outweigh any of the risks.” 

A few weeks before Berger advertised for 
help, cavers who work with him had discov- 
ered the Dinaledi Chamber in the Rising Star 
cave system, about 50 kilometres northwest 
of Johannesburg. Berger hoped to remove the 
remains as soon as possible, but he needed 
help. The narrow chamber is about 30 metres 
below ground, and the only access is through a 
slit in the rock some 20 centimetres wide (see 
‘Tough commute’). “I’m not physiologically
appropriate to ever get in the system,” he says, 
referring to his large build. A month after his 
social-media post, Berger had six scientists at 
work in the cave. 


TREASURE TROVE 

As he and other colleagues watched a video 
feed in a nearby tent, the excavators pulled up 
skulls, femurs, teeth and hundreds of other 
specimens. “By the end of this expedition, we 
had recovered more individual remains than 
had been discovered in South Africa in the last 
90 years,” says Berger.

The research team with which Berger usu- 
ally works to analyse early human remains was 
busy making sense of other fossils, discovered 
in 2008 at the nearby Malapa site. So Berger 
put out another social-media call, this time 
recruiting more than 30 early-career scien- 
tists to attend a month-long workshop to 
analyse and describe the fossils. 

John Hawks, a palaeoanthropologist 
at the University of Wisconsin–Madison
who helped to coordinate the Rising Star 
dig and workshop, says that the team took 
flak for its unorthodox approach. “There's a 
lot of the field that really believed we're just a 
couple of cowboys who don’t know how things 
should be done,” he says. 

The team intends to publish at least a dozen
papers from the workshop in the coming
months; the two published today are the first.
They describe the discovery site and the anatomy
of H. naledi, the skull of which encased a
small, fist-sized brain much like those of other
early members of the genus Homo and of the
more ancient australopiths. In other ways, its
body is more like those of modern humans,
with the lower limbs and feet of a biped and
hands that could have gripped tools with precision.
The researchers estimate that H. naledi
would have stood just under 1.5 metres tall and
weighed between 40 and 55 kilograms.

“It is a very strange combination of features,
some that we've never seen before and some
that we would have never expected to find
together,” says Hawks.


TOUGH COMMUTE


To access the fossil-rich Dinaledi Chamber, six slender
scientists had to penetrate deep into the Rising Star cave
system, squeezing themselves through a passage dubbed
Superman Crawl because cavers have to extend one arm
overhead — like the flying superhero — to get through.

[Diagram: cross-section of the Rising Star cave system, labelled with Superman Crawl, Dragon’s Back, a 12-m vertical shaft and the Dinaledi Chamber, with an elevation marker at 1,480 m above sea level; inset map locating Rising Star near Johannesburg, South Africa.]


Homo naledi’s skull looks like an australopithecine’s.


FAMILY RESEMBLANCE

The researchers are unclear about how H. naledi
is related to other early human species
that lived in Africa, such as Homo erectus
and Homo habilis. They hope to date calcite
deposits in the cave to establish the age of the
remains, which could be more than 1 million
years old.

The chamber contains no evidence that early 
humans lived there, and no bones from species 
other than H. naledi, so Berger believes that it 
might be a deliberate burial, and possibly the 
oldest known. Currently, the oldest site that 
seems to represent an early human burial is 
Sima de los Huesos in the Atapuerca Moun- 
tains of Spain, which dates to 430,000 years ago. 

Fred Spoor, a palaeontologist at University 
College London, agrees that the bones rep- 
resent a previously unknown Homo species, 
and says that Berger's team makes a good case, 
after considering other alternatives, that the 
remains were deposited intentionally. He is 
eager to see what other experts make of it. 

However, Jeffrey Schwartz, an evolution- 
ary biologist at the University of Pittsburgh in 
Pennsylvania, thinks that the material is too 
varied to represent a single species. “I could 
show those images to my students and they 
would say that they're not the same,” he says. 
One of the skulls looks more like it comes from 
an australopithecine, he adds, as do certain 
features of the femurs. 

Schwartz and others will soon get the
chance to judge the Rising Star remains for
themselves. Berger’s team has uploaded data
including 3D scans of the remains to the
MorphoSource repository, and welcomes
other researchers to study the material at
first hand. Berger did the same with remains
of a species called Australopithecus sediba that
were discovered at the Malapa site.

Schwartz says that he has had trouble 
accessing some researchers’ hominin remains
even after they had been described in a jour- 
nal. But when he asked Berger’s team if he 
could purchase A. sediba casts several years 
ago, he got them for free. “How good can you 
be?” says Schwartz. “It’s been refreshing and 
delightful that Lee Berger has always made his 
specimens accessible.”


SOURCE: P. H. G. M. DIRKS ET AL. ELIFE 4, 09561 (2015)


WITS UNIVERSITY 


GLOBAL HEALTH 




Africa braced for snakebite crisis 


Health specialists warn that stocks of antivenom will run out in 2016.


BY QUIRIN SCHIERMEIER 


Rural Africa is facing a resurgence of
a persistent plague that rarely makes
headlines: snakebite.

By June next year, stockpiles of the anti- 
venom that is most effective against Africa's 
vipers, mambas and cobras are expected to run 
out because the only company that makes the 
medicine has stopped production. With no 
adequate replacement in sight, the death toll 
from bites is set to rise, specialists warned at a 
tropical-medicine congress last week in Basel, 
Switzerland. 

“We're dealing with a neglected health crisis 
that is turning into a tragedy for Africa,” says 
Gabriel Alcoba, a medical adviser with the 
international humanitarian group Médecins 
Sans Frontières (MSF; also known as Doctors
Without Borders). 

Poisonous snakes might seem an archaic 
menace in such a rapidly urbanizing world. 
Yet by cautious estimates, snakebites kill more 
than 100,000 people worldwide every year 
(see ‘Death toll’) — more, on average, than 
lose their lives in natural disasters. And survi- 
vors often experience permanent physical and 
mental disabilities. 

In 2010, the French drug firm Sanofi Pasteur
in Lyon ceased production of Fav-Afrique, an
antibody serum that reduces the quantity of
venom circulating in the blood of a snakebite
victim. Made from the purified plasma of horses
previously injected with small quantities of
snake venom, the serum neutralizes the venom
of many of Africa's most dangerous snakes.
The antidote has saved many people from
bites by deadly species such as the carpet viper
(Echis ocellatus), common in West Africa,
and the black mamba (Dendroaspis polylepis),
found across the sub-Saharan region. But the
high costs — US$250-500 per person — and a
supply shortage mean that only about 10% of
snakebite victims in Africa get treatment, and
the company says that producing the antidote is
no longer profitable. Cheaper products by competitors
have forced Sanofi Pasteur out of the
African market, says Alain Bernal, a company
spokesman. Sanofi Pasteur is working to enable
the transfer of know-how to companies willing
to take over production of Fav-Afrique, he says.


The deadly carpet viper (Echis ocellatus).


DEATH TOLL


Fuzzy estimates of snakebite fatalities


It is uncertain how many people are bitten or
die from snakebites in sub-Saharan Africa.
But according to Médecins Sans Frontières
(MSF; also known as Doctors Without
Borders), whose health-care workers treat
snakebites through field programmes in the
Central African Republic and South Sudan, an
estimated 30,000 people die each year and at
least 8,000 more undergo amputations.

But snakebite mortality could be much
higher than anecdotal reports suggest. For
some countries, including the Democratic
Republic of Congo — home to an enormous
number of venomous snakes — there are no
reliable data, says tropical-medicine specialist
David Warrell at the University of Oxford, UK.

Under-reporting is not limited to Africa.
The authors of a nationally representative
snakebite-mortality survey, published in
2011, deduced that, despite the availability
of antidotes, around 46,000 people in India
die of snakebites every year (B. Mohapatra
et al. PLoS Negl. Trop. Dis. 5, e1018; 2011).
India’s Central Bureau of Health Intelligence
reported merely 1,219 and 985 fatal bites
for 2009 and 2010, respectively. One
reason for the discrepancy, says Warrell,
who co-authored the study, is that many
victims of snakebites die before they reach
a hospital, or waste precious time with
traditional healers before seeking more-
conventional medical help. Q.S.

Pharmaceutical companies in South Africa, 
India, Mexico and Costa Rica are among those 
marketing cheaper products — some of which 
work well against snakes in their host nations. 
But their safety and effectiveness against the 
large variety of species in Africa have not yet 
been established in clinical trials. To speed up 
the process, MSF is offering two of its hospi- 
tals in the Central African Republic (CAR) 
and South Sudan as study sites. But it will take 
at least two years to validate the products in 
development, and none is as broadly efficient 
as Fav-Afrique, Alcoba says. 


NEGLECTED THREAT 

Although just now becoming critical, Africa's 
snakebite problem has been smouldering for 
years, says tropical-medicine specialist David 
Warrell of the University of Oxford, UK, who 
consults for the World Health Organization 
(WHO). Snakebite fatalities have been rising 
over the past decade in the CAR, Ghana and 
Chad — in part owing to a failure to train 
enough medical staff, ignorance from health 
ministries and “unscrupulous marketing” of 
inappropriate antivenoms, he says. “War-torn 
countries have many other problems. But 
the millions of children, poor farmers and 
nomadic people at risk of snakebites just don't 
have the ear of politicians in capital cities.”

And according to Warrell, the WHO has 
done little to help. To improve the safety and 
efficacy of antibodies, the agency has released 
guidelines for producing antivenoms. But 
it has no formal programme for improving 
treatment by training medical workers, advis- 
ing ministries or educating communities, as it 
does for 17 other neglected tropical diseases, 
including dengue and sleeping sickness. And 
yet, says Warrell, snakebites cause more deaths 
than do all 17 diseases put together. 

Warrell says that, while waiting for clinical 
trials to bring replacements for Fav-Afrique
to the market, the keys to reducing the risk of 
snakebite are education and preventive meas- 
ures — such as wearing proper shoes, using 
a light when walking home from the fields 
and sleeping above ground level, beneath a 
mosquito net. 

Thankfully, says Alcoba, the global-health 
community is starting to grasp the urgency of 
the situation. “People used to laugh when we 
talked about snakebites,” he says. “They don't 
laugh any more.”




PAUL STAROSTA/CORBIS 




NIH disclosure rules falter 


Regulations that require researchers to disclose conflicts of interest yield questionable data and cost universities millions.


BY SARA REARDON 


When a US Senate investigation in
2008 revealed that psychiatrist
Charles Nemeroff of Emory University
in Atlanta, Georgia, had not disclosed
at least US$1.2 million in income from drug 
companies, Senator Charles Grassley decided 
to do something about it. The Iowa Republi- 
can led a charge to push the National Institutes 
of Health (NIH), which funded Nemeroff’s 
research, to change how it evaluates research- 
ers who accept money from industry. 

The resulting reforms, which took effect 
in 2012, require scientists to report industry 
connections in greater detail than before, and 
charge institutions with determining which 
ties are problematic. But three years later, 
it is not clear what the costly, cumbersome 
rules have accomplished. A Nature analysis 
suggests that institutions have vastly differ- 
ent standards for what constitutes a conflict 
— and that they classify relatively few rela- 
tionships between researchers and industry 
as troublesome. 

“There’s a lot more financial conflict of interest
in my view than the NIH is getting from the
reports of universities,” says Sheldon Krimsky,
who studies conflict-of-interest issues at Tufts
University in Medford, Massachusetts. “We're
just seeing the tip of the iceberg.”

The reforms, enacted by the NIH’s parent 
agency, the Department of Health and Human 
Services (HHS), do seem to have increased 
the number of financial relationships that 
researchers report to their universities — by 
45% overall, according to data from 56 uni- 
versities in a survey released in April by the 
Association of American Medical Colleges 
(AAMC) in Washington DC (see
go.nature.com/hc5r2b). But the number of conflicts that
institutions reported to the NIH has increased 
only slightly, according to NIH data obtained 
by Nature through a freedom-of-information 
request (see ‘Under the microscope’).

The agency’s original conflict-of-interest 
regulations, implemented in 1995, required 
institutions to report when an HHS-funded 
researcher received more than $10,000 from an 
outside source. The revised rule lowered that 
threshold to $5,000 and directed researchers to 
disclose a wider variety of potential conflicts, 
such as sponsored travel and relationships with 
non-profit organizations. 

UNDER THE MICROSCOPE


Through a freedom-of-information request, Nature obtained conflict-of-interest reports submitted to
the US National Institutes of Health (NIH). For more on our methodology, see go.nature.com/11pjj6


OUTLOOK HAZY


Data from the NIH, which cover the period from August 2012 to May 2015, suggest that the number of
conflicts of interest that an institution reports does not always reflect the number of grants that it receives.

[Scatter plot: number of financial conflict-of-interest disclosures (0 to 140) per institution. Labelled institutions: University of California, San Francisco; Stanford University; Pennsylvania State University Hershey Medical Center; Johns Hopkins University; Yale University; Massachusetts General Hospital; University of Wisconsin–Madison; University of Texas, Austin.]


SMALL CLAIMS


Institutions reported 2,523 financial conflicts of interest between January 2013
and May 2015. Most were relatively small.

[Bar chart: number of financial conflict-of-interest disclosures (0 to 1,000) by cost of claim (US$, thousands), in bins 0-4.9, 5-9.9, 10-19.9, 20-39.9, 40-59.9, 60-79.9, 80-99.9, 100-199.9, 200-299.9, 300-399.9, 400-499.9, 500-599.9, 600-999.9, 1,000-5,999.9, 6,000+ and Unknown.]

31 claims involved $1 million or more. The
largest confirmed by Nature was $13 million.

HHS reforms lowered the
reporting threshold from
$10,000 to $5,000.

Determining the financial
value of some relationships,
such as a stake in a start-up
company, can be difficult.


SOURCE: NIH


Institutions, which receive conflict-of-interest
reports from their researchers annually, must
then convene an internal panel to determine
whether a particular relationship could affect
a researcher's work. If so, the panel designs
a ‘management plan’ that may require the
researcher to disclose the conflict in publications
or, in some cases involving human subjects,
to stand down as the study’s primary
investigator. Institutions then send these plans
to the NIH.

Universities have spent millions of dollars
and hired extra staff to comply with these
reforms, and most administrators are furious
about the burden. “We already had an annual
disclosure process for all the faculty,” says
Andrew Rudczynski, associate vice-president
for research administration at Yale University
in New Haven, Connecticut. “I can't see a
single benefit to it.”

Yale spent $500,000 to implement the
revised NIH rules. In the year after they took
effect, the number of disclosures by the university’s
researchers doubled — but Yale identified
just one new conflict, Rudczynski adds. Other
universities report similar experiences.

And whereas the HHS had estimated that the
roughly 2,000 institutions that it funds would
spend $23.2 million a year to comply with
the regulations, the AAMC survey suggests
that the true cost has been much higher. Just
71 institutions spent a total of $23 million in
the year after the reforms took effect, although
their costs going forward may be lower.

Paul Thacker, who led the 2008 Senate inves- 
tigation as a member of Grassley’s staff, admits 
that it is difficult to know how well the reforms 
are working. That is largely because the poten- 
tial benefits of greater disclosure of financial 
ties, such as peer reviewers giving closer scru- 
tiny to studies by researchers with conflicts, are 
tough to measure. 

Still, Thacker says, there is a clear need 
for closer scrutiny. This is backed up by evi- 
dence showing that studies funded by pri- 
vate sources, such as drug firms, more often 
produce results that benefit the funder than 
do publicly funded studies (A. Lundh et al. 
Cochrane Database Syst. Rev. 12, MR000033; 
2012). And Thacker has little sympathy for 
universities’ complaints. “It just shows that 
they still don’t get what the problem is,” he 
says. “They’re in this place today because 
they've failed to create confidence for the 
public in the past.” 

Others worry that the HHS policy is still not 
strict enough. Krimsky says that the current 
rules may give institutions too much power to 
assess conflicts, without accounting for ways 
that universities themselves can be compro- 
mised by ties to government or industry. This 
could be one reason why the HHS reforms 
did not significantly increase the number of 
reported conflicts, Krimsky adds. 

Those pushing for greater transparency 
are also frustrated that the NIH does not 
require institutions to publish information 
about researchers’ conflicts and management
plans online. Instead, members of the public
must ask a university for information on a 
researcher’s conflicts; the institution has five 
days to disclose dollar amounts and sources. 
Nonetheless, the NIH Office of Extramural 
Research says that about 50% of institutions 
that submit conflict-of-interest reports have 
voluntarily created online databases, although 
these vary in usability and completeness. 
Requesting such information from universities
directly also produces mixed results. Nature
contacted 20 public and private institutions
that had reported individual researchers with
conflicts of interest involving more than $1 million,
seeking details on these relationships.
The majority of these institutions
responded immediately, but some took
as long as two weeks
to respond, directed Nature’s reporter to the
media office, or instructed her to submit a
freedom-of-information request. Most
declined to share information about conflicts
that occurred before the current calendar year,
which is not required by the HHS.

Nor does the department require the release 
of management plans, which troubles Tobin 
Smith, vice-president for policy at the Asso- 
ciation of American Universities in Washing- 
ton DC. “If you disclose that there is a conflict 
but don't disclose how the university is man- 
aging it — which is not part of the regulations 
— the public doesn’t understand the relation- 
ship,” he says. 

The NIH also struggles to defend its own
regulations. “One could debate whether or
not we needed to promulgate a new rule,” 
says Sally Rockey, director of the NIH Office 
of Extramural Research. “At the time, there 
was a lot of scrutiny in the press and Congress 
got involved.” She concedes that the reforms 
were mostly in response to this outside pres- 
sure. (Grassley declined to comment on the 
regulations.) 

And it is unclear whether the revised regu- 
lations would have identified Nemeroff, who 
did not tell Emory about his industry rela- 
tionships. “Science and research are built on 
trust, and we are still at the mercy of what’s 
disclosed to us,” says Eric Mah, senior director
of research compliance at the University of
California, San Francisco. 

The NIH plans to review the conflict-of- 
interest reforms later this year, to develop 
best practices for compliance. The agency 
will examine data on the type and number of 
reported conflicts, as well as institutions’ expe- 
riences of complying with the requirements. 
But Rockey says that the HHS is unlikely to 
make significant changes to the rules, given 
that they took four years to develop. 

In the meantime, research institutions are 
caught in a bind. The 1980 law that allows US 
universities to patent inventions encourages 
relationships with industry, and tight federal 
research budgets are driving more scientists 
to seek support from private funders. “There 
are no easy answers,” Thacker says. “Universities
are being pushed into greater reliance
on industry funding and until that reverses,
these problems just become more and more
complicated.”


Hunt for cosmic waves to resume 


Upgraded LIGO detectors will improve chances of finding ripples in space-time.


BY DAVIDE CASTELVECCHI 


Almost 100 years after Einstein presented
his general theory of relativity in a
Berlin lecture theatre, the quest to spot
the gravitational waves he predicted may be
entering its final stages.

This week, the world’s largest gravitational- 
wave facility is expected to start collecting data 
again after a 5-year US$200-million overhaul. 
The Laser Interferometer Gravitational-Wave 
Observatory (LIGO) searched fruitlessly for 
these cosmic ripples for almost a decade in 
the 2000s. But the odds that its improved ver- 
sion — known as Advanced LIGO — will detect 
any waves in the next three months may be as 
high as one in three, according to some of the 
physicists involved in the experiments. 


Initial tests have shown that the observatory’s 
twin detectors, in Washington state and Louisi- 
ana, are performing as expected, says Gabriela 
Gonzalez, spokesperson for the 900-strong 
LIGO Scientific Collaboration. And that is 
no mean feat for an instrument that has cost 
$620 million so far. “It’s the first time that any- 
thing in this field is on budget and on sched- 
ule,” says Karsten Danzmann, director of the 
Max Planck Institute for Gravitational Physics, 
in Hannover, Germany, who is not part of the 
LIGO management team. 

According to general relativity, gravitation 
originates from the interplay between massive 
objects and the malleable fabric of space-time. 
Einstein predicted that accelerating masses such 
as colliding neutron stars or black holes would 
disturb that fabric and produce gravitational
ripples that propagate through the Universe.

Each of LIGO’s detectors is designed to meas- 
ure the deformation of space-time by compar- 
ing changes in the paths of laser beams that race 
down its two perpendicular 4-kilometre-long 
arms, bounce between mirrors and interfere 
with each other back at their source. When a 
gravitational wave passes through, it slightly 
alters the lengths of the arms, and the observatory
can spot such changes with a sensitivity
of one part in 10²¹. That is comparable to a
hair’s-width change in the distance from the Sun 
to Alpha Centauri, its nearest star, says Laura 
Cadonati, a physicist at the Georgia Institute of 
Technology in Atlanta who will be coordinating 
the experiment's data analysis. 
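
The hair's-width comparison can be checked with rough numbers. The sketch below assumes a hair about 0.1 millimetres wide and a distance to Alpha Centauri of about 4.37 light years; neither figure is given in the article, and the function name is illustrative.

```python
# Rough check of the hair's-width comparison.
# Assumed values (not from the article): hair width ~0.1 mm,
# Alpha Centauri ~4.37 light years away.
METRES_PER_LIGHT_YEAR = 9.4607e15


def fractional_change(delta_m: float, length_m: float) -> float:
    """Return the fractional length change delta / length."""
    return delta_m / length_m


hair_width_m = 1e-4
distance_m = 4.37 * METRES_PER_LIGHT_YEAR
ratio = fractional_change(hair_width_m, distance_m)
print(f"fractional change ~ {ratio:.1e}")
```

Under these assumptions the ratio comes out at a few parts in 10²¹, which is the scale of change the detectors are built to register.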

A crucial part of the improvement is better 
damping of the vibrations caused by
less-than-heavenly sources. The problem
was especially acute at the site in Livingston, 
Louisiana, which is in the middle of a timber 
plantation. Any felling of trees would disturb 
the detector, so it could keep its laser beams ‘in 
lock’ — vibrating at precise frequencies — only
at night or on weekends. A passing train would 
knock the site out for an hour, says physicist 
Brian O'Reilly, who will coordinate the follow- 
up of detections at the Livingston site. But now, 
he says, the detector should be able to take data 
over several days at a time without interruption. 

Advanced LIGO is already three times 
more sensitive than its predecessor, but in 
three months’ time it will shut down for more 
improvements that will make it ten times more 
sensitive. When it reopens around 9 months 
later, it should be able to spot cosmic ripples 
from cataclysmic events — such as the colli- 
sions of black holes — up to 120 megaparsecs 
(326 million light years) away on a regular 
basis and sample a volume of space 1,000 times 
greater than the original observatory. 
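
The step from a tenfold gain in sensitivity to a 1,000-fold gain in volume follows from geometry, assuming that the distance at which a given source can be detected grows in proportion to sensitivity and that sources are spread evenly through space. A minimal sketch under those assumptions (the function name is illustrative):

```python
def volume_gain(sensitivity_gain: float) -> float:
    """Surveyed volume grows as the cube of the detector's reach."""
    return sensitivity_gain ** 3


print(volume_gain(3))   # the current factor-of-three improvement
print(volume_gain(10))  # the planned factor-of-ten improvement
```

On the same reasoning, the present threefold improvement already corresponds to a surveyed volume 27 times larger than before.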

Next year, LIGO will be joined by a slightly 
smaller €200-million (US$226-million) Franco- 
Italian detector near Pisa, Italy, called Advanced 
Virgo, which is undergoing its own upgrade. 
The LIGO and Virgo teams will pool their data 
to check each other’s detections. They expect 
to see waves from mergers of binary neutron 
stars — events that should generate strong, pre- 
dictable signals — but do not know precisely 
how many to anticipate. “It could be, depending 
on the models, ten binary neutron star detections
a year or so,” Gonzalez says. “But it could
be 10 times higher or 100 times lower.”

“The first detections will be quite dramatic 
for us,” says Rainer Weiss, a theoretical physicist
at the Massachusetts Institute of Technology
in Cambridge who was one of LIGO’s founders.
“The first thing we will need to sort out is
whether we truly believe what we are seeing.”

Having detectors on different continents is 
crucial for providing a rough estimate of the 
origin of the waves, says Fulvio Ricci, a physi- 
cist at the Sapienza University of Rome and the 
spokesperson for Virgo. Once they know that, 
astronomers will be able to look for other signs 
of that event using electromagnetic radiation, 
such as X-rays or visible light. 

Einstein published his first papers on gravita- 
tional waves in 1916. Detecting these ripples a 
century later, Weiss says, would be of “enormous 
symbolic importance”.


MORE ONLINE

INTERVIEW
Why marine life needs protection from noise pollution go.nature.com/qdldtz

MORE NEWS
• California snowpack lowest in past 500 years go.nature.com/c2gul3
• Scientists trial humane shark deterrents go.nature.com/uqlhll
• Southern Ocean sucks up more carbon dioxide than was thought go.nature.com/wfh4vh

NATURE PODCAST
Camouflaging drug delivery; science meets theatre; and the health impacts of air pollution nature.com/nature/podcast


302 | NATURE | VOL 525 | 17 SEPTEMBER 2015


ASTRONOMY


Dates added to
meteor calendar


Skywatching cameras spot 86 previously unknown events.


BY ALEXANDRA WITZE


A meteor (upper left) streaks through the Orion constellation during the Perseid shower.
BABAK TAFRESHI/NATIONAL GEOGRAPHIC CREATIVE


The list of meteor showers that occur
every year has just grown longer.
Eighty-six previously unknown
showers have now joined the regular spectaculars,
which include the Perseids, Leonids
and Geminids. Astronomers spotted
the shooting-star shows using a network
of video cameras designed to watch for
burglars, but repurposed to spy cosmic debris
burning up in Earth’s atmosphere.

The newfound showers are faint but important:
each is fuelled by Earth’s passage through
a trail of particles left behind by a comet or
asteroid, so mapping them reveals previously
unknown sources of dust.

“The cool thing is, we are not just doing
surveillance of meteors in the night sky,”
says Peter Jenniskens, an astronomer at the
SETI Institute in Mountain View, California.
“Now we also have a three-dimensional picture
of how dust is distributed in the Solar
System.”

Most of the particles are the size of a sand 
grain, but a few are large enough to survive 
the searing heat of their passage through 
the atmosphere — and possibly do damage 
on Earth’s surface. Jenniskens and his col- 
leagues describe the discoveries in four papers 
accepted for publication in Icarus. 

Astronomers have been documenting 
meteors for centuries, first by eye and more 
recently with radar and video-tracking systems. 
Meteors sprinkle Earth steadily throughout the 
year, but during a shower a significant num- 
ber seem to originate from the same point in 
the sky. Skywatchers around the world have 
reported more than 750 possible meteor show- 
ers to the International Astronomical Union 
(IAU) — but only a small fraction of those have
been confirmed as bona fide events. 


SKY SURVEILLANCE 

Jenniskens’ team set up cameras at three loca- 
tions in northern California to confirm or 
rule out these rumoured showers. The Cam- 
eras for Allsky Meteor Surveillance (CAMS) 
project points 60 security cameras in different 
directions to capture as many shooting stars as 
possible. Each has a relatively narrow field of
view, but together they cover a broad dome of
sky centred directly overhead and extending 
down to 30° above the horizon. 

“CAMS is about getting massive data sets
on meteors, so you can see through all the
scatter to get at those new showers,” says
Phil Bland, a planetary scientist at Curtin
University in Perth, Australia. He helps to
run a tracking network in the Australian
outback that looks for extremely bright
meteors in an effort to recover meteorites.

Since it began in 2010, CAMS has
measured more than 250,000 meteors.
Of those, about three-quarters were random
singletons and one-quarter came in showers.
CAMS has confirmed 81 showers that were on
the IAU’s questionable list, and discovered 86
new ones.

Among these is one that lights up South- 
ern Hemisphere skies in early December, and 
seems to radiate from the constellation Vela. 
It is surprisingly strong for a shower that had 
not been noticed before, says Jenniskens. 
During the March 2013 peak of a newly con- 
firmed shower, skywatchers saw the bright 
flash of a rock-sized object hitting the Moon. 


The CAMS team has been expanding its 
search by setting up smaller camera networks 
in the Netherlands and New Zealand. “The 
more we sample the sky,” says Jenniskens, “the
more detailed our picture becomes of what is
coming in.”


CORRECTIONS 

The News story ‘Encryption faces quantum 
foe’ (Nature 525, 167-168; 2015) incorrectly 
named the location for the cryptography 
workshop that began on 6 September. The 
workshop was held at the Schloss Dagstuhl-Leibniz
Centre for Informatics in Wadern,
not the Leibniz Center for Informatics in
Oktavie-Allee. 

The News Feature ‘Fishing for the first 
Americans’ (Nature 525, 176-178; 2015) 
incorrectly credited the photo taken at 
Cooper’s Ferry. Credit should have gone to 
Hayden Wilcox, not Joanne McSporran. 

The News story ‘Health study set to decide 
data policy’ (Nature 525, 16-17; 2015) 
incorrectly stated that an NIH working group 
planned to create a blanket data-sharing 
policy for the Precision Medicine Initiative.
It is in fact developing a policy that can
accommodate participants’ varying interest 
in seeing their own genetic information. 


SPECIAL


INTERDISCIPLINARITY


Scientists must work together to save the world. A special 
issue asks how they can scale disciplinary walls. 


To solve the grand challenges facing society
— energy, water, climate, food, health
— scientists and social scientists must
work together. But research that transcends
conventional academic boundaries is harder
to fund, do, review and publish — and those
who attempt it struggle for recognition and
advancement (see World View, page 291).
This special issue examines what governments,
funders, journals, universities and academics
must do to make interdisciplinary work a joy
rather than a curse.

A News Feature on page 308 asks where the
modern trend for interdisciplinary research
came from — and finds answers in the proliferation
of disciplines in the twentieth century,
followed by increasingly urgent calls to bridge
them. An analysis of publishing data explores
which fields and countries are embracing interdisciplinary
research the most, and what impact
such research has (page 306). On page 313, Rick
Rylance, head of Research Councils UK and
himself a researcher with one foot in literature
and one in neuroscience, explains why interdisciplinarity
will be the focus of a 2015-16 report
from the Global Research Council. Around the
world, government funding agencies want to
know what it is, whether they should invest
in it, whether they are doing so effectively and,
if not, what must change.

How can scientists successfully pursue 
research outside their comfort zone? Some 
answers come from Rebekah Brown, director 
of Monash University’s Water for Liveability 
centre in Melbourne, Australia, and her col- 
leagues. They set out five principles for suc- 
cessful interdisciplinary working that they have 
distilled from years of encouraging researchers 
of many stripes to seek sustainability solutions 
(page 315). Similar ideas help scientists, curators
and humanities scholars to work together on a
collection that includes clay tablets, papyri, 
manuscripts and e-mail archives at the John 
Rylands Research Institute in Manchester, UK, 
reveals its director, Peter Pormann, on page 318. 
Finally, on page 319, Clare Pettitt reassesses 
the multidisciplinary legacy of Richard Francis 
Burton — Victorian explorer, ethnographer, 
linguist and enthusiastic amateur natural 
scientist who got some things very wrong, 
but contributed vastly to knowledge of other 
cultures and continents. Today’s would-be 
interdisciplinary scientists can draw many les- 
sons from those of the past — and can take our 
polymathy quiz online at nature.com/inter.


INTERDISCIPLINARITY
A Nature special issue
nature.com/inter


17 SEPTEMBER 2015 | VOL 525 | NATURE | 305 


© 2015 Macmillan Publishers Limited. All rights reserved 


ILLUSTRATION BY DEAN TRIPPE 


NEWS FEATURE


An analysis reveals the extent and impact
of research that bridges disciplines.


BY RICHARD VAN NOORDEN


Interdisciplinary work is considered crucial by
scientists, policymakers and funders — but
how widespread is it really, and what impact
does it have? Scholars say that the concept is
complex to define and measure, but efforts to
map papers by the disciplines of the journals
they appear in and by their citation patterns
are — tentatively — revealing the growth and
influence of interdisciplinary research.


Interdisciplinary research is on the rise


REFERENCES
Since the mid-1980s, research papers have increasingly cited work outside their own disciplines. The analysis shown here used journal names to assign more than 35 million papers in the Web of Science to 14 major conventional disciplines (such as biology or physics) and 143 specialities. The fraction of paper references that point to work in other disciplines is increasing in both the natural and the social sciences. The fraction that points to another speciality in the same discipline (for example, a genetics paper pointing to zoology) shows a slight decline.

[Charts, 1950 to 2010. Panels: Natural sciences and engineering; Social sciences. Series: references within same speciality; references to other specialities in same discipline; references to other disciplines.]

RHETORIC
Discourse about interdisciplinary research is increasing. The fraction of papers that mention interdisciplinarity in their title has fluctuated, perhaps reflecting the priorities of funders, but the twenty-first century saw that proportion reach an all-time high.

[Chart, 1950 to 2010: papers with “interdisciplinar*” in title (%). Series: Social sciences and humanities; Natural sciences and engineering.]
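
The three reference categories in this panel can be computed for a paper once every journal has been assigned a discipline and a speciality. The sketch below uses a small hypothetical journal table; the journal names, their assignments and the function name are illustrative assumptions, not data from the analysis.

```python
from collections import Counter

# Hypothetical journal -> (discipline, speciality) assignments.
JOURNAL_FIELDS = {
    "J. Genet.": ("biology", "genetics"),
    "J. Zool.": ("biology", "zoology"),
    "J. Phys.": ("physics", "general physics"),
}


def reference_shares(paper_journal, cited_journals):
    """Return the share of references falling in each category."""
    discipline, speciality = JOURNAL_FIELDS[paper_journal]
    counts = Counter()
    for journal in cited_journals:
        ref_discipline, ref_speciality = JOURNAL_FIELDS[journal]
        if ref_discipline != discipline:
            counts["other discipline"] += 1
        elif ref_speciality != speciality:
            counts["other speciality, same discipline"] += 1
        else:
            counts["same speciality"] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {category: n / total for category, n in counts.items()}


shares = reference_shares(
    "J. Genet.", ["J. Genet.", "J. Genet.", "J. Zool.", "J. Phys."]
)
print(shares)
```

Averaging such shares over all papers published in a given year would give one point on each of the three curves.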
Interdisciplinary research takes time to have an impact


IMPACT
Whether interdisciplinary research gains more citations than disciplinary research is contentious. Over three years, papers with diverse references tend to pick up fewer citations than the norm, but over 13 years they gain more. Some studies suggest that a little interdisciplinarity is better than a lot: papers that combine very disparate fields tend to earn fewer citations. But interdisciplinary work can have broad societal and economic impacts that are not captured by citations.

[Chart, two panels: change in average citations (%) plotted from less interdisciplinarity to more interdisciplinarity. Three years after publication: less impact; citations decrease as a paper’s interdisciplinarity increases. Thirteen years after publication: more impact; citations increase as a paper’s interdisciplinarity increases. In the long term, citations rise sharply when a paper’s references point to distant disciplines (for example, engineering and biology). Legend includes ‘Variety: the spread of references’.]


306 | NATURE | VOL 525 | 17 SEPTEMBER 2015 


SOURCE: V. LARIVIERE & Y. GINGRAS IN BEYOND BIBLIOMETRICS 
(EDS B. CRONIN & C. R. SUGIMOTO) 187-200 (MIT PRESS, 2014) 


SOURCE: J. WANG ET AL. PLOS ONE 10, E0127298 (2015)


SOURCE: V. LARIVIERE & C. R. SUGIMOTO (PERS. COMM.) 


SOURCE: PERS. COMM./ELSEVIER; 
HTTP://GO.NATURE.COM/UCPXVD 


Some fields are more interdisciplinary than others...


A MEASURE OF INTERDISCIPLINARITY
In this chart, more-interdisciplinary fields are in the top-right quadrant. A field’s position is determined by how much its papers cite outside disciplines (x-axis), and by how much outside disciplines subsequently cite the papers (y-axis). Papers from 2001-10 were assigned to disciplines and specialities on the basis of their journals, using a US National Science Foundation classification system.

NATURE.COM
For an interactive version of this graphic, visit: go.nature.com/z9m3gy

[Scatter plot. x-axis: references to outside disciplines (0 to 100%); y-axis: citations from outside disciplines (0 to 100%). Points are coloured by discipline, including Arts, Biology, Biomedical research, Chemistry, Clinical medicine, Earth and space, Engineering and technology, Humanities, Mathematics, Physics, Psychology and Social sciences. Labelled specialities include Geriatrics & gerontology, Chemical physics, Physiology, Probability & statistics, Materials science, Nursing, Inorganic & nuclear chemistry, Astronomy & astrophysics, and Nuclear & particle physics.]
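
A field's coordinates in this chart can be sketched as two simple fractions, one over the references its papers make and one over the citations they receive. The records below are invented for illustration, and the data layout and function name are assumptions rather than the method used for the graphic.

```python
def outside_share(own_discipline, linked_disciplines):
    """Fraction of links (references or citations) that cross disciplines."""
    if not linked_disciplines:
        return 0.0
    outside = sum(1 for d in linked_disciplines if d != own_discipline)
    return outside / len(linked_disciplines)


# Hypothetical field: disciplines of the works its papers cite (x-axis)
# and of the works that later cite its papers (y-axis).
field = "chemistry"
references = ["chemistry", "physics", "chemistry", "biology"]
citations = ["chemistry", "engineering", "biology", "physics", "chemistry"]

x = outside_share(field, references)  # references to outside disciplines
y = outside_share(field, citations)   # citations from outside disciplines
print(f"x = {x:.0%}, y = {y:.0%}")
```

A field that both draws on and is used by other disciplines scores high on both fractions and so lands towards the top right.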


...and so are some countries


MOST INTERDISCIPLINARY COUNTRIES
A 2015 study by researchers with the publisher Elsevier defined interdisciplinary papers as those that reference journals that are rarely cited together. The report looked only at countries that routinely publish more than 30,000 papers per year to find the ‘most interdisciplinary’ countries for 2013.

[Bar chart: publications in world’s top 10% of interdisciplinary papers (%). Countries shown, in order: India, Mainland China, Taiwan, South Korea, Brazil, Italy, United States, Japan, United Kingdom, Germany. Legible bar labels: 13.0%, 11.9%, 11.0%, 10.3%, 9.7%, 9.7%, 9.1%.]

A separate analysis counted the proportion of a paper’s references that are in other disciplines. After totting up all the papers for each country, and normalizing the results (so that average interdisciplinarity = 1), similar nations emerge on top for 2013.

1. China (1.09*)
2. India (1.07)
3. Taiwan (1.06)
4. Brazil (1.04)
5. Australia (1.02) and South Korea (1.02)

*9% higher than world’s average interdisciplinarity

17 SEPTEMBER 2015 | VOL 525 | NATURE | 307 


SOURCE: V. LARIVIERE (PERS. COMM.) 


Interdisciplinarity has become all
the rage as scientists tackle society’s
biggest problems. But there is still
strong resistance to crossing borders.


BY HEIDI LEDFORD 


308 | NATURE | VOL 525 | 17 SEPTEMBER 2015


Asking for US$40 million is never easy, but Theodore
Brown knew his pitch would be a particularly tough 
sell. As vice-chancellor for research at the University 
of Illinois at Urbana-Champaign in the early 1980s, 
Brown had been tasked with soliciting a major dona- 
tion from wealthy chemist and entrepreneur Arnold Beckman, 
a graduate of the university. Beckman was hesitant, believing 
that the university should receive most of its support from the 
state. So Brown decided to devise a project like nothing he had 
ever seen before. 

In 1983, he and his colleagues put together a proposal for an 
institute that had little chance of being funded through normal 
channels. It would defy the powerful disciplinary cartography 
that defines many modern universities, bringing together 
members of different departments and inducing them to work 
together on common projects. Brown argued that it would allow 
faculty members to tackle bigger scientific and societal ques- 
tions than they normally could. 

“The problems challenging us today, the ones really worth 
working on, are complex, require sophisticated equipment and 
intellectual tools, and just don’t yield to a narrow approach,” he
says. “The traditional structure of university departments and
colleges was not conducive to cooperative, interdisciplinary
work.”

It was an early example of the push for interdisciplinary 
research that is now sweeping universities around the globe. 
Although Brown was not completely alone — the interdiscipli- 
nary Santa Fe Institute in New Mexico was founded around the 
same time — he was advocating crossing boundaries before it
became fashionable. And his proposal met strong resistance.
Department heads fretted that faculty members — and their 
grants — would be snatched away. Some colleagues scorned 
Brown's idea of creating open office spaces to foster interac- 
tions between graduate students: surely the din would make 
it impossible to get serious work done. And then there was 
the stigma. “Interdisciplinary research is for people who 
aren’t good enough to make it in their own field,” an illustrious
physicist chided.

But Beckman liked the idea and committed the full 
$40-million asking price — at that time, the largest-ever 
private donation to a US public university. A few hectic 
years later, the 29,000-square-metre Beckman Institute for 
Advanced Science and Technology was born. 

The institute struggled to recruit a qualified director willing 
to take a chance on the new model, so Brown took the helm. 
Soon, large grants from organizations such as the Department 
of Defense and the National Science Foundation poured in, 
hushing many critics. By the time Brown left the institute in 
1993, other leading universities were sending delegations there 
to learn from the model. Researchers from Beckman — which 
now has more than 200 affiliated faculty members — have 
achieved attention-grabbing results, including helping to cre- 
ate one of the first graphical web browsers. 

Since the Beckman was founded, the interdisciplinary 
model has spread around the world, countering the trend 
towards specialization that had dominated science since the 
Second World War. Cross-cutting institutes have sprouted 
up in the United States, Europe, Japan, China and Australia, 
among other places, as researchers seek to solve complex 
problems such as climate change, sustainability and public- 
health issues. The interdisciplinary trend can be seen in pub- 
lication data, where more than one-third of the references in 
scientific papers now point to other disciplines (see page 306). 
“The problems in the world are not within-discipline prob- 
lems,” says Sharon Derry, an educational psychologist at the 
University of North Carolina at Chapel Hill who studies inter- 
disciplinarity. “We have to bring people with different kinds 
of skills and expertise together. No one has everything that’s 
needed to deal with the issues that we're facing.”

Even so, supporters of interdisciplinary research say that it 
has been slow to catch on, and those who do cross academic 
disciplines face major challenges when applying for grants, 
seeking promotions or submitting papers to high-impact 
journals. In many cases, scientists say, the trend is nothing 
more than a fashionable label. “There's a huge push to call 
your work interdisciplinary,” says David Wood, a bioengineer
at the University of Minnesota in Minneapolis. “But there’s 
still resistance to doing actual interdisciplinary science.” 


HIGHLY DISCIPLINED 


The idea of dividing academic inquiry into discrete categories 
dates back to Plato and Aristotle, but by the sixteenth century, 
Francis Bacon and other philosophers were mourning the 
fragmentation of knowledge. 

One problem lay in the rapid growth of science: there was
too much information spread across the disciplines for any
one person to handle. Science historian Peter Weingart of
Bielefeld University in Germany points to Carl Linnaeus’s
taxonomic treatise Systema Naturae as an example: between
its first edition in 1735 and its last in 1768, the catalogue
swelled from 10 pages to 2,300, covering 7,000 species.

In the nineteenth century, the disciplinary boundaries of
the modern university started to take root. The disciplines
surged in number and power after the Second World War, as
nations, particularly the United States, boosted their research
support. “It’s the moment when universities increased exponentially,”
says Vincent Larivière, an information scientist at
the University of Montreal in Canada. “And the size of the
university increased by creating more departments.”

Tensions between the United States and the Soviet Union
also played a part, says Weingart. The Soviets boasted a
research programme geared towards solving societal problems,
for example improving agriculture to boost food security.
By contrast, US President Dwight Eisenhower argued
that basic research should be untethered. “In the field of intellectual
exploration, true freedom can and must be practised,”
he said in a 1959 speech. And although basic research need
not necessarily be disciplinary, it does not have the same
pressure towards interdisciplinarity as does applied research.


“WE HAVE TO BRING PEOPLE WITH DIFFERENT KINDS
OF SKILLS AND EXPERTISE TOGETHER. NO ONE HAS
EVERYTHING THAT’S NEEDED.”


Specialities proliferated as individual disciplines were
repeatedly subdivided. Biology was split into botany and
zoology, then into evolutionary biology, molecular biology,
microbiology, biochemistry, biophysics, bioengineering and
more. Late last year, Jerry Jacobs, a sociologist at the University
of Pennsylvania in Philadelphia, counted the number of
biology-related departments at Michigan State University in
East Lansing. There were nearly 40.

From this thicket, the term ‘interdisciplinary’ emerged.
The earliest citation in the Oxford English Dictionary dates
back to December 1937, in a sociology journal. But even at
that time, some believed that the word was already overused.
In a report to the US Social Science Research Council
in August that year, a sociologist at the University of Chicago
in Illinois lumped ‘interdisciplinarity’ in with other “catch
phrases and slogans which were not sufficiently critically
examined” (R. Frank Items 40, 73-78; 1988).

As an academic movement, interdisciplinarity caught on
during the 1970s and has been growing ever since, says Larivière.
He credits that rise in part to libraries, which began
to stockpile subscriptions and improved researchers’ access
to journals in alternative fields. A particle physicist could
more easily browse biology journals, say. Furthermore, the
US focus began to shift from basic research and scientific
liberty back to societal problems such as environmental protection,
which can rarely be tackled by a single discipline.

The United States was not alone: in 1994, an influential
book partially sponsored by the Swedish Council for Planning
and Coordination of Research called The New Production
of Knowledge (Sage) predicted, among other things,
an increasingly interdisciplinary future as science seeks to solve
socially relevant questions. That book had an impact, says
Larivière, particularly in the European
Union’s Fifth Framework funding programme, which
ran from 1998 to 2002 and emphasized interdisciplinary,
problem-oriented research.

Soon, interdisciplinary institutes began to sprout up
around the world, each with its own unique structure and
purpose. One of the first, the Santa Fe Institute, founded in
1984, focused on applying advanced mathematics and computational
skills to a range of disciplines. Others, such as
the Massachusetts Institute of Technology’s David H. Koch
Institute for Integrative Cancer Research in Cambridge, or
the neuroscience-focused Janelia Research Campus in Ashburn,
Virginia, tackle questions within a specific discipline
but draw in work from other fields. And some, such as the
Monash Sustainability Institute in Clayton, Australia, focus
on specific problems.


“THERE IS CONSTANT PRESSURE ON ME TO MAKE A
CROSS-FACULTY, CROSS-INSTITUTION ALLIANCE. IF I
WANT TO BUILD A NEW BUILDING, THE MORE ALLIES
I HAVE, THE EASIER IT IS TO RAISE THE MONEY.”

Even as the trend gained momentum, interdisciplinary 
researchers continued to hit the same hurdles that Brown 
had encountered. In 1998, chemist Richard Zare at Stanford 
University in California helped to launch the interdiscipli- 
nary institute Bio-X. But an influential colleague urged him 
not to move his lab into the Bio-X building. Doing so would 
essentially take Zare away from the chemistry department 
and his committee and teaching duties there, the colleague 
argued, weakening the department. 

Although he was well established, Zare worried about 
going against the establishment. “It was very serious,” he 
says. The risk is even greater for young professors seeking 
tenure, he notes. 

In 2004, in response to the growing interest in interdiscipli- 
nary work — and the challenges that face those who attempt 
it — the US National Academies released a report called 
Facilitating Interdisciplinary Research. The authors advised 
institutions to lower barriers, for example by making budgets 
flexible so that costs could be shared across departments. 

The publication drew a large audience. It has been down- 
loaded more than 7,600 times and had impact beyond US 
shores. At Durham University, UK, says physicist Tom 
McLeish, administrators referred to the report when they 
were forging a series of on-campus interdisciplinary centres. 
Around that time, McLeish was serving as pro-vice-chancel- 
lor of research, and saw interdisciplinarity as a way to make 
the small university shine on the world stage. He battled with 
department chairs who feared that the centres would reduce 
their budgets, and he worked to set up a promotion system 
that rewards investigators on large team grants in the same 
way as those on single-investigator grants. The university 
now has interdisciplinary centres on topics ranging from 
resilience — both ecological and psychological — to the 
history of medieval science. 

The interdisciplinary trend is also growing in Asia. In 
2000, the National Natural Science Foundation of China 
(NSFC) laid out a plan for interdisciplinary research, and
universities have launched several cross-cutting centres
over the past decade, including the Academy for Advanced
Interdisciplinary Studies at Peking University in Beijing. The
NSFC plans to launch further interdisciplinary projects in the
coming years, says Yonghe Zheng, deputy director-general of
the foundation’s Bureau of Science Policy. “China is a
developing country,” he says. “So the universities and institutes can
quickly set up some new centres which reflect the new trend 
in interdisciplinary research.” 

Nanyang Technological University in Singapore estab- 
lished its Interdisciplinary Graduate School in 2012; it 
already has 335 students, out of a total graduate-school 
population of 2,000. Nanyang’s interdisciplinary graduate 
programme, which bills itself as the first of its kind in Asia, 
was designed in part to expand the university’s fundraising 
options, says Bo Liedberg, dean of the programme. Because 
industry is often focused on real-world problems that cross 
disciplines, an interdisciplinary programme could foster 
more collaborations with business, he reasons. 

That focus on interdisciplinarity as a revenue stream is 
widespread, says Merlin Crossley, a molecular biologist and 
dean of the faculty of life sciences at the University of New 
South Wales in Sydney, Australia. “There is constant pressure
on me to make a cross-faculty, cross-institution alliance,” he
says. “If I want to build a new building, the more allies I have,
the easier it is to raise the money.” Arizona State University
in Tempe saw its federal funding rise by 162% from 2003 to 
2012 as it promoted interdisciplinarity across its campus (see 
Nature 514, 292-294; 2014). 

Despite this pressure, interdisciplinarity’s reach remains 
modest. For every Nanyang or Durham, there are hundreds 
of universities that have not embraced significant change. 
Departmental dividers remain in place — and in power — at 
most institutions, says Nancy Andreasen, a neuroscientist at 
the University of Iowa in Iowa City who co-chaired the committee
that wrote the National Academies report more than
a decade ago. “It has been an enormous disappointment.” 


For institutions or programmes that have embraced 
interdisciplinarity, the transition has not always been easy. 
The most common mistake is underestimating the depth 
of commitment and personal relationships needed for a 
successful interdisciplinary project, says Laura Meagher, 
a consultant based near St Andrews, UK, who coaches 
interdisciplinary teams. “You see people who think it’s not 
much more than stapling a bunch of CVs to the back of a 
proposal,” she says. “They don’t realize that it takes time to
build a relationship.”

When the push for collaboration comes from the top, some 
of that focus on personal relationships could be lost — leav- 
ing the project to suffer, she says. The UK Energy Research 
Centre (UKERC) in London, which since 2004 has coordi- 
nated and carried out sustainable-energy research, learned 
how delicate interdisciplinary relationships can be, says Mark 
Winskel, a social and political scientist at the University of 
Edinburgh who evaluated the centre's first decade. Its initial 
five-year phase went well, he says, and culminated in a key 
publication: Energy 2050, which synthesized the institution's 
results and translated them into recommendations. But the 
next five-year phase failed to produce a similar achievement. 

Winskel surveyed members and found that changes in the 
UKERC’s structure designed to open it to a wider community
— for example by offering several rounds of fresh grants
in the middle of phase two — had upset some established
long-term relationships. “We became a more diverse com- 
munity of scholars and disciplines,” he says. “But that also 
means you become less cohesive.” The UKERC learned from 
the experience: its third phase, launched in May 2014, aims to 
provide more stability for collaborative relationships. 

Social scientists in particular often face that lack of 
cohesion, says Thomas Heberlein, a social psychologist at the 
University of Wisconsin—Madison. When funders emphasize 
the societal impacts of the work they support, social scientists
are often called in to assess the broader implications of a
project. But, he says, it is obvious — and insulting — when a
social scientist is asked to join a project as a way to tick a box, 
without a true commitment to incorporating the discipline 
into the project. 


SOCIAL STRUGGLE


Several UK studies have found that social scientists are less 
likely than researchers in other disciplines to want to par- 
ticipate in interdisciplinary projects. For Heberlein, who 
has long collaborated with ecologists and environmental 
scientists, one of the stumbling blocks is what he calls “the 
hegemony of the natural sciences”. Those disciplines tend to 
be held in higher esteem than more qualitative fields such 
as the social sciences, and they are deemed more rigorous 
by funders and researchers, he says. That imbalance leads to 
frustration and undermines collaboration. Heberlein, whose 
speciality is in conducting surveys of public opinions, says 
that natural scientists often naively suggest that they can 
design and execute surveys themselves using an Internet 
tool such as SurveyMonkey. Heberlein disagrees: “It’s really 
hard to do the stuff we do,” he says. “Our measurements are
complicated.”

Lack of respect can run in many directions when different 
kinds of researchers come together. Wood says that bio- 
engineers are always cautioned against having their grants 
reviewed by panels of biologists, who may be dismissive of 
engineering research goals and measurements. But he has 
also served on review panels in which engineers have recoiled 
at the limitations of clinical research. 

As more researchers become involved with interdiscipli- 
nary work, the mutual suspicion has started to ease. There 
have also been some signs of success in the funding arena. 
The US National Institutes of Health (NIH), for example, 
says that interdisciplinary proposals fare as well as, or slightly 
better than, more conventional applications. The European 
Research Council, by contrast, has noted that interdiscipli- 
nary grant proposals on average do not fare as well in review 
panels as projects that are narrower in scope. 

The atmosphere for publishing is also mixed. Interdiscipli- 
nary researchers have long complained that it is difficult to get 
their papers into top-tier disciplinary journals. Heberlein says 
that the rise of interdisciplinary journals has helped in his field, 
but he worries about the standard of some of the papers they 
publish. And he questions the wisdom of training graduate 
students across disciplines before they have immersed themselves
in the rigours of one area. “You’ve got to develop your
disciplinary skills first,” he says. “The bad news is the quality of
this research is pretty bad and may be getting worse.” 

Many view the institutional push for interdisciplinar- 
ity as an experiment in progress. “The celebrations have 
begun, but the actual data on what kind of difference this 
makes are not in,” says Scott Frickel, a sociologist at Brown
University in Providence, Rhode Island.

As more institutions adopt new ways to organize research, 
some are also trying to rethink their assessment processes, 
says McLeish. In July, he and his colleagues at Durham
released a report called Evaluating Interdisciplinary Research,
and he was surprised when academic societies and funders
flocked to learn more. “We didn’t anticipate that we’d be
launching this report into an atmosphere where everyone
wants to know this,” he says.

And the pace of change varies across the globe. In the 
United States, the NIH ran a programme to stimulate inter- 
disciplinary research from 2004 to 2012. It resulted in some 
changes, such as starting to recognize multiple principal 
investigators on what had been considered single-inves- 
tigator grants — a switch that removed a disincentive to 
collaborate. Since then, the agency has not perceived a need 
to follow up with any other incentives, noting that there are 
more than 4,000 active NIH-funded research projects that 
bill themselves as interdisciplinary. “Our general sense is that 
interdisciplinary research has become a very standard way 
of doing science,” says Betsy Wilder, head of the NIH Office
of Strategic Coordination. “It really pervades NIH funding.”

In some other countries, the experiment has just begun. 
Chemist Ayyappanpillai Ajayaghosh, director of the National 
Institute for Interdisciplinary Science and Technology in 
Thiruvananthapuram, India, says that momentum is building 
in his country to promote more interdisciplinary projects. In
Japan, theoretical physicist Tetsuo Hatsuda left the University
of Tokyo in part because he felt that the boundaries between 
disciplines were too heavily enforced there. In 2013, he joined 
the RIKEN research institute in Wako, Japan, and launched 
an interdisciplinary team of theoretical physicists, chemists 
and biologists to work out techniques that will accelerate 
all three fields. He hopes that the effort will stimulate more 
interdisciplinary work in the country. “Japan is a little behind 
other countries,” he says. “Theoretical science is a good starting
point because it is easy for us to interact.”

Some 25 years after it opened, the Beckman Institute's exper- 
iment in interdisciplinary research has been a success, says 
Brown. The centre continues to attract distinguished faculty 
members and large team grants — last year it won a research 
contract worth up to $12.7 million from the federal govern- 
ment’s Intelligence Advanced Research Projects Activity 
programme — even though competition for such money has 
increased as more universities build interdisciplinary teams. 

And Brown bristles at the suggestion that the global 
push for interdisciplinarity might be a fad. “The answer is 
a resounding ‘no’,” he says. “Things have changed — now
people focus on big problems, and if you go for a big problem
you need to be interdisciplinary.” SEE EDITORIAL P.289


Heidi Ledford writes for Nature from Boston, 
Massachusetts. 




Five principles for fruitful partnerships p.315
Scholars and technologists probe texts ancient and modern p.318
Ethnographer Richard Francis Burton, reappraised p.320
Tally of crackdowns under China’s revised law p.321


Global funders to focus 
on interdisciplinarity 


Granting bodies need more data on how much they are spending on work 
that transcends disciplines, and to what end, explains Rick Rylance. 


Three arguments are often made in
favour of interdisciplinary research.
First, complex modern problems such
as climate change and resource security are
not amenable to single-discipline investigation;
they often require many types of expertise
across the biological, physical and social
disciplines. Second, discoveries are said to be
more likely on the boundaries between fields,
where the latest techniques, perspectives and
insights can reorient or increase knowledge¹.
The influence of big-data science on many 
disciplines is a good example. Third, these 
encounters with others benefit single disci- 
plines, extending their horizons. 




The arguments against interdisciplinary 
work are also familiar. Devotees of normal- 
ized citation measures often contend that 
interdisciplinary research is inferior. Some 
fear that it drains funds, time and energy 
from ‘core’ disciplines. Research funders 
often hear complaints that schemes targeted 
at interdisciplinarity distract researchers. 
There is a persistent argument that ‘you can’t
have inter-disciplines without disciplines’.




According to proponents of interdisciplinarity,
obstacles abound. Academic
institutions’ budgets, governance and pro- 
motion arrangements are usually organized 
around single disciplines, as are processes at 
many granting bodies and journals. Interdis- 
ciplinary research struggles for prestige — as 
measured by quantitative metrics that favour 
single disciplines — and it is trickier to peer 
review. Thus early-stage researchers are often 
advised that starting on an interdisciplinary 
trajectory is not a smart move. 

One striking aspect of this debate is how 
poor the consolidated data are on which to 
base judgements. This is why the Global 
Research Council (GRC) has selected 
interdisciplinarity as one of its two annual 
themes for an in-depth report, debate and 
statement between now and mid-2016. 
(The other is the position of women in sci- 
ence and research.) The GRC is a federation 
of more than 50 national research funders, 
with representatives from countries includ- 
ing Brazil, China, Japan, Russia, the United 
Kingdom and the United States. Participants 
include the US National Science Founda- 
tion, Research Councils UK (RCUK), Sci- 
ence Europe and the Chinese Academy of 
Sciences. I serve on the GRC’s governing 
board, in my capacity as chair of RCUK. 

As it has done in recent years with peer 
review and open access, the GRC aims 
to establish a common position on inter- 
disciplinarity — a topic on many people's 
minds worldwide, and one in which I have a
personal interest. 


GROUND TRUTH 

So, what do we know? The 2014 Research 
Excellence Framework (REF) — a multi- 
year UK exercise that assessed universi- 
ties’ research strengths in 2008-13, and 
which thus determines funding — found 
that, when academics were asked to submit 
cases of research to REF that had significant 
impact outside academia, 80% were inter- 
disciplinary. However, items submitted to 
discipline-based REF panels under-repre- 
sented the quantity of top interdisciplinary 
research published by UK researchers in 
some fields². These included health sciences,
mathematics, information technology and 
the humanities. This is despite growth in UK 
interdisciplinary work overall. (The United 
Kingdom’s share of the top 10% most interdisciplinary
research grew from 7.9% to
9.1% in the four years to 2013.) In my view, 
this suggests that researchers perceive inter- 
disciplinary research to be vulnerable to 
discipline-based assessment. 

Further evidence comes from the UK 
government's recent triennial review of the 
country’s seven national research councils³.
The review heard ‘evidence’ — what I con- 
sider opinion — to the effect that current 
structures did not serve interdisciplinary 


research well, and that it was significantly 
more difficult to gain funding for this than 
for mainstream activity. The review rec- 
ommended that RCUK — the councils’ 
umbrella body — investigate this, which it 
has been doing. 

It is difficult to get clear answers in 
response to the allegation that funding is 
more difficult to obtain for interdiscipli- 
nary work. Sample tests do not sustain the 
view that success rates for interdisciplinary 
grants are significantly adrift. But fund- 
ing data are not easily analysed in this way. 
This is in part because there are different 
schemes under which interdisciplinary work 
is undertaken: for example, through ‘grand 
challenge’-style programmes, fellowships or 
‘highlighted’ opportunities in mainstream 
schemes. Awards are also made in areas in
which interdisciplinarity is simply
the norm, such as design. So, what
should be included? More fundamental,
however, is an issue of definition. What
should be measured when evaluating the
funding of interdisciplinary activities?

Arcane debates about whether research is 
inter-, multi-, trans-, cross- or post-discipli- 
nary complicate data collection. People also 
speak of methodological, theoretical, instrumental,
critical, restructuring and bridge-building
interdisciplinarity⁴. I find this faintly
theological hair-splitting unhelpful. But there
are areas in which discrimination is important.
One is the difference between ‘near-neighbour’
or ‘distant’ disciplines.

Interdisciplinary research that involves 
neighbour disciplines is much more com- 
mon, and significantly easier to develop, 
than areas in which the disciplinary stretch 
is vast and the logistics and intellectual challenge
more demanding. This seems a significant
point of analysis and one featured in a
study² by the publisher Elsevier, which used
a citation-based approach to review inter- 
disciplinarity in the United Kingdom. The 
measure considered the diversity of citations 
and the disciplinary distance between them 
to determine the extent of a paper's discipli- 
nary reach. The German Research Founda- 
tion (DFG) has used similar techniques for 
its funding portfolio, again demonstrating 
significant differences between ‘near’ and 
‘far’ interdisciplinarity — far research being 
more complex to undertake⁵.


CASE STUDY 

I have personal experience of the challenges
of interdisciplinary working. My back- 
ground is in English literature, but I have 
worked for many years on the history of psychology,
in particular on the intersection of
mind and biomedical systems. Separately, I
work with neurologists on what the brain is 
doing when a person reads complex verbal 
artefacts such as poems. This is tested 
experimentally using functional magnetic 
resonance imaging. 

My personal interest is in why, in brain-processing
terms, culture might be good
for you (if it is). Clinicians have different —
but compatible — concerns, for example in 
recovering advanced reading functions and 
well-being following head injury. Education- 
alists are interested in information process- 
ing and interpretation. 

Of my two areas of research — one his- 
torical, the other experimental — the first 
is not much of a stretch, intellectually or
methodologically. The second is. I had to 
learn new things: to work in a team, to work 
with complicated machinery, to observe 
ethical protocols and to raise money. I have 
had to acquire knowledge of brain anatomy 
and statistical analysis, and learn a different 
research mindset. This has been far from 
straightforward. It has meant, for instance, 
adjusting how I think about elementary 
issues such as ‘what constitutes sufficient, 
appropriate evidence?’; methods of analy- 
sis; how inferential conclusions can be sus- 
tained; and how to write up results. 

The generic protocols of a scientific 
paper and those for a piece of humanities 
research are very different. This is a matter 
both of how to express oneself and of the 
way the proposition is shaped in the first 
place. I have found that it is easy to be too 
‘arty’ for the scientist and too ‘sciencey’ for
the arts researcher. A humanities colleague 
remarked that the statistics “might as well be 
in Russian”; a scientist asked why the poems 
we used in the neurology experiments were 
by different people (for example, Shake- 
speare and Milton): couldn't we just write 
our own for consistency? 

And then there is the question of serial 
investigation. The cycle of grant, paper, 
grant, paper and so on does not pertain in 
the humanities, in which articles tend to 
emerge from longer projects that culminate 
in a book. In my experience, issues about 
raising grants (from whom?), satisfying 
peer review (from which constituency?) and 
gaining career recognition are relevant. But 
paramount is confronting the groundwork 
challenges that come with interdisciplinary 
work — especially those that require ‘stretch’ 
— and doing so with integrity, honesty and a
degree of disciplinary self-denial. 

There is evidence that the first steps in 
establishing interdisciplinary projects are 
crucial. This was a finding of a review⁶ of the
European Union’s efforts to stimulate interdisciplinary
work under its Fifth Framework
Programme for research development. Projects
did not succeed as well as they might
have because they did not facilitate ‘enabling’
conversations from the outset and because
they lacked coherent leadership. Interdisciplinary
work requires particular skills,
mindsets and attention to establishing
common ground⁷,⁸.

ANDREW BERTULEIT PHOTOGRAPHY/CORBIS


FACT FINDING 

Interdisciplinarity will be a headline topic 
at the GRC annual meeting in Delhi in 
May 2016, organized by India’s Science 
and Engineering Research Board and 
RCUK. A report on the state of play 
worldwide is being commissioned by 
RCUK, on behalf of the GRC (the team to 
undertake the research will be appointed 
in October). 

The report will survey current policy 
and practice among global research 
funders. What forms of support do 
they offer to interdisciplinary research? 
How and where is it done? What are its 
outputs and impacts? The survey will 
begin to establish base data on how inter- 
disciplinarity can best be stimulated and 
managed, and look for good practice 
in this most precious and complex of 
research endeavours. 

The GRC expects to issue a policy 
statement following this meeting, as it has 
done previously on topical areas. These 
documents focus and clarify attitudes 
on key subjects. They marshal data that 
can be used while national policies are 
established and international coopera- 
tion is developed. We need much bet- 
ter definitions of what kind of thing we 
are supporting when and if we support 
interdisciplinary research, and better 
intelligence about what works.


Rick Rylance is chief executive of the 
Arts and Humanities Research Council, 
chair of Research Councils UK, and a
member of the governing board of the 
Global Research Council. 

e-mail: r.rylance@ahrc.ac.uk 


1. Lakhani, K. R., Jeppesen, L. B., Lohse, P. A.
& Panetta, J. A. The Value of Openness in
Scientific Problem Solving Harvard Business
School Working Paper (2007).

2. Elsevier. A Review of the UK’s Interdisciplinary
Research Using a Citation-based Approach 
(HEFCE, 2015). 

3. Department for Business, Innovation & Skills. 
Triennial Review of the Research Councils (BIS, 
2014). 

4. Klein, J. T. in The Oxford Handbook of 
Interdisciplinarity (eds Frodeman, R. et al.) 
15-30 (Oxford Univ. Press, 2010). 

5. German Research Foundation. 
Interdisciplinary Review Processes: Structural 
Impact and Funding Success (DFG, 2013); 
available at http://go.nature.com/uyfxlp (in 
German). 

6. Bruce, A., Lyall, C., Tait, J. & Williams, R. 
Futures 36, 457-470 (2004). 

7. McLeish, T. & Strang, V. Leading 
Interdisciplinary Research: Transforming the 
Academic Landscape (Leadership Foundation 
for Higher Education, 2014). 

8. Whitfield, J. Nature 451, 872-873 (2008). 


Equipping cities to weather our changing climate takes many disciplines working together.


How to catalyse
collaboration


Turn the fraught flirtation between the social and 
biophysical sciences into fruitful partnerships 
with these five principles, urge Rebekah R. Brown, 
Ana Deletic and Tony H. F. Wong. 


An urgent push to bridge the divide
between the biophysical and the
social sciences is crucial. It is the only
way to drive global sustainable development
that delivers social inclusion, environmental
sustainability and economic prosperity¹.
Sustainability is the classic ‘wicked’ problem²,
characterized by poorly defined requirements,
unclear boundaries and contested
causes that no single agency or discipline is
able to address³.

It is crucial to understand, then, why so 
many well-meaning attempts at interdisci- 
plinary collaboration fail to deliver tangible 
outcomes — and why others succeed. Here 
we offer an unapologetically personal answer 
by reflecting on how, working across multi- 
ple faculties of Monash University in Mel- 
bourne, Australia, we have built a team of 


Ry , ? INTERDISCIPLINARITY 


i A Nature special issue 
_| nature.com/inter 


17 S 
© 2015 Macmillan Publishers Limited. All rights reserved 


disciplinary experts that delivers integrated 
and sustainable water management across 
multiple cities. 

We have now grown this interdisciplinary 
team to incorporate other institutions nation- 
ally and internationally. At the same time, we 
acknowledge that substantial transaction 
costs come with interdisciplinary research — 
it takes extra time and effort to make it work. 


PERSONAL JOURNEY 

Our journey began in the early 2000s, with 
two maturing groups working on urban 
water research: one in the faculty of engi- 
neering, focused on sustainable stormwater 
technologies, and the other in the faculty of 
arts, focused on urban water governance (see
Supplementary Information; go.nature.com/pjgbmn).
The research teams had a common
impact agenda, and our collaboration grew 
from a realization that an interdisciplinary 
approach would be more effective. In 2005, 
the two groups joined and secured funding 
for the establishment of a Aus$4.5-million
(US$3.1-million) Facility for Advancing
Water Biofiltration⁴ that brought together
more than 20 Monash researchers and PhD 
students across civil engineering, ecology 
and sociology. By 2012, this had culminated 
in the award of a Aus$120-million Coopera- 
tive Research Centre (CRC) for Water Sensi- 
tive Cities. It comprises a partnership of more 
than 85 organizations, including 13 research 
institutions, and around 230 researchers and 
PhD students from more than 20 disciplines 
and subdisciplines across the social and bio- 
physical sciences and humanities. 

Over the past decade, our collaborations 
have increasingly made a practical differ- 
ence. We produce regular synthesis docu- 
ments (see, for example, ref. 5) containing 
technology information and enabling policy 
advice, written in an accessible way to facili- 
tate engagement and uptake. These have 
been heavily used in policy and strategy doc- 
uments, which speeded up the adoption of 
our research. For example, stormwater reg- 
ulations introduced in the state of Victoria 
in 2006 were underpinned by our research, 
and other state and local governments in 
Australia have adopted our recommended 
performance targets for the management 
of urban run-off. As a consequence, our 
stormwater-biofiltration technology has 
been increasingly adopted in cities across 
Australia⁶, Singapore, China and Israel.
Since 2010, our expanded framework for
integrated city-wide water-cycle management⁷,⁸
has been used by governments (such
as those of Australia, Singapore and China) 
and international organizations (such as the 


Asian Development Bank) to guide their 
strategic planning and investment. 

In that time, we have had to resolve 
considerable tension, which hinders mean- 
ingful collaboration. The biophysical sci- 
ences tend to have well-agreed theories; the 
social sciences spend much time developing 
(and often disagreeing on) theoretical ques- 
tions. Both fields have control and compari- 
son at their core. But biophysical researchers 
mainly perform quantitative research (often 
in well-controlled and replicable laboratory 
conditions), whereas social science can be 
qualitative or quantitative, and also use 
interpretative validation approaches. 

We witnessed biophysical researchers 
accusing social scientists of poor rigour and 
of spending too much time conceptualizing 
problems without exploring and offering 
solutions. Conversely, social scientists were 
often frustrated that biophysical researchers 
were too focused on solutions, reductively 
overlooking the wider societal implications 
of their proposed solutions. 

This discord is exacerbated by an inherent 
cultural hierarchy that often privileges the 
biophysical over the social sciences. Environ- 
mental problems have typically been framed 
from a biophysical perspective, meaning that 
social scientists are not effectively engaged in 
developing integrated solutions⁹.


FIVE PRINCIPLES 

The journey was not for everyone, and we lost 
some talent along the way. Yet many stayed 
on. How did we help academics to overcome 
these biases? We used these five principles. 


MAKE IT MAINSTREAM 


Forge a shared mission. Driving our 
collaborative journey was the shared mis- 
sion of delivering water-management strat- 
egies that address the challenges of floods, 
droughts and degraded waterways. This 
approach fosters more sustainable, resilient, 
productive and liveable cities — for a healthy 
planet and population. The shared mission 
provided a compelling account of the overall 
goal of the collaboration, included impact 
as a necessary outcome, and was sufficiently 
broad to incorporate meaningful roles for all 
disciplinary researchers involved. 

This mission also maintained a sense of 
purpose in the face of occasional failure and 
of the ongoing investment of huge time and 
effort to appreciate the norms, theories and 
approaches of other disciplines. When we 
needed the input of certain disciplines, and 
hastily included researchers that did not 
share the mission, it was not a success. The 
subsequent departure of these researchers 
from the team initially weakened the skill set 
of the group, but provided the motivation 
to expand our collaboration across multiple 
institutions. 


Develop ‘T-shaped’ researchers. In our
experience, interdisciplinary collaborations
have the greatest chance of success
when researchers are ‘T-shaped’¹⁰: able both
to cultivate their own discipline and to
look beyond it. Breadth and depth are key.
T-shaped researchers build credibility by aim- 
ing for the highest scientific contribution in 
their field — a point of particular importance 
for early-career researchers, whose prospects
for promotion are judged against research
excellence criteria (see principle 5). T-shaped
researchers also engage actively with other
disciplines (see principle 3) to understand and
appreciate their norms, theories, approaches
and breakthroughs.

Ways to promote interdisciplinary research

Funders

• Manage funding from an interdisciplinary perspective while
reinforcing research impact. Discipline-based agencies must form
joint funding programmes.

• Panels should include a balance of experts from the social and
biophysical sciences, with a strong appreciation of other disciplines.
It is also useful to include end-users of the research (for example,
practitioners and policymakers).

• Calls for funding should request balance between disciplines and
prefer teams that have a proven record of collaboration. Publication
in applicants’ own disciplines should be essential; publishing in other
disciplines is desirable.

Institutions

• Introduce key performance indicators that promote T-shaped
researchers. For example, include qualitative measures of impact on
policy and practice, as well as conventional academic indices.

• Identify institutional research strengths that show potential for
interdisciplinary collaboration and incentivize it through seed grants.

• Reduce transaction costs: for example, through summer schools to
develop constructive dialogue skills. Provide platforms (seminars,
research workshops, debating competitions) to discuss challenges
in cross-disciplinary research and offer insights into the norms and
cultures of other disciplines. Co-locate researchers from different
disciplines who work on the same grand challenges.

• Invest in interdisciplinary PhD cohorts, co-supervised by academics
from diverse departments or faculties.

Publishers

• Invest in and create high-quality interdisciplinary journals, managed
by editorial teams or boards of T-shaped researchers.

• Run special issues in high-impact, single-discipline journals that
focus on interdisciplinary research.

• Peer reviewers should assess work using their disciplinary expertise,
while being tasked to be open to innovations across disciplines.

Researchers

• Build stamina, patience and self-awareness to manage the long
journey of establishing a productive interdisciplinary team.

• Put your best ideas forward even if they are unfinished, and be
open to alternative perspectives from other disciplines, policymakers,
industry practitioners and community members.

• Prioritize depth early on, and embrace breadth by building
relationships with those from other fields and practices.

SOURCE: R.R.B., A.D., T.H.F.W.

316 | NATURE | VOL 525 | 17 SEPTEMBER 2015

© 2015 Macmillan Publishers Limited. All rights reserved

Many believe that interdisciplinary 
research delays career progression or is the 
luxury of senior researchers. This has not 
been our experience: many of our research- 
ers were able to maintain a high publication 
rate in their own discipline, and — as part 
of a team — secure increasing interdisci- 
plinary research funding. However, it took 
nearly five years to start publishing our joint 
interdisciplinary research in high-impact 
journals. 


Nurture constructive dialogue. Through a 
decade of trial and error, we have invested 
heavily in creating the environment and 
informal rules that empower researchers 
across all sciences to engage effectively, 
despite their vastly different approaches to 
research design and methodology, and their 
differing technical vocabularies and com- 
munication cultures. 

This has involved some commitments: to 
interact in plain English (disciplinary jargon 
is frowned on); to foster empathy and respect 
for different disciplinary norms; and to reflect 
on what is working in collaborative interac- 
tions. We designed regular interdisciplinary 
forums using these rules. This led to the co- 
development of key publications — for exam- 
ple, through interdisciplinary workshops, we 
have jointly written three annual reports for 
policymakers and water practitioners’. These 
activities grew into a sought-after annual 
short course and a massive open online 
course (MOOC) showcasing different disci- 
plinary approaches to urban water challenges. 

Reaching the ideal of constructive com- 
munication across the sciences takes time 
and practice — researchers new to the 
group may not yet have the necessary skills. 
Typically, they pass through three stages of 
development (see ‘Journey to T’). Initially, 
new collaborators tend to dominate dis- 
cussions and assert the primacy of their 
discipline. Soon after, they recognize the 
importance of other disciplines and adopt 
a more passive demeanour. Eventually, the
researchers settle into a space of construc- 
tive dialogue. 

We find that some quit and others stay 
to become mature collaborators, able to 
co-create across academic disciplines and 
broader networks. The role of more expe- 
rienced collaborators is to support new 
colleagues’ personal journeys into these 
dynamic relationships. 


[Figure: JOURNEY TO T. Researchers who are new to working with people
from other disciplines oscillate between asserting the primacy of their
own field and hanging back. With time they can become capable of breadth
and depth (T-shaped), and able to engage in constructive dialogue and
co-creation. Axis labels: Behaviour, running from "Dominance (high)"
through "dialogue" to "Passivity (high)"; Listening, running from low to
high. Annotations: "Nurture nascent skills in safe learning environments,
interdisciplinary forums, synthesis workshops and writing groups."
"Support dynamic learning with informal rules such as plain speaking,
open-mindedness, empathy and respect." "Experienced researchers develop
the skills for interdisciplinary working in enduring partnerships
towards shared goals."]


Give institutional support. Academic
career pathways for interdisciplinary
research are essential if it is to attract and
retain the brightest and best. Monash
University’s senior leadership team con- 
sistently signalled that it values research 
that is interdisciplinary, attracts significant 
industry involvement and delivers real- 
world impact — despite the organizational 
structures and global academic norms that 
are biased towards more conventional, dis- 
ciplinary approaches. 

This value was communicated to research- 
ers through university policies, promotion 
criteria and seed-funding programmes. 
For example, the engineering faculty has
introduced qualitative research standards
(alongside the conventional quantitative
measures) that attempt to measure the
[text illegible in source]. The faculties
of engineering and arts now award
small competitive grants to teams
from both faculties to catalyse collaborations.

Monash has established a PhD pro- 
gramme for cohorts of students working on 
a common global challenge across a number 
of disciplines; for instance, sustainable urban 
water management in developing Asian 
cities. These groups work in a constructive 
dialogue environment. 


Bridge research, policy and practice. 
Finally, the establishment of enduring 
connections between researchers, policy- 
makers and industry practitioners proved 
to be an important driver in growing our 
interdisciplinary collaborations. Refresh- 
ingly, industry rarely thinks in disciplinary 
silos. They tend to tackle complex problems 
from a range of perspectives, thereby model- 
ling integrated, solution-focused thinking. 
To ensure real-world impact, we engaged
policy and industry partners in the design
of our research programme and encouraged 
them to critique our scientific approach and 
presentation of results. We also ran frequent 
events that allowed professionals from policy 
and industry to interact with researchers. 
For example, in 2008, through a national 
roadshow, we showcased how our research is 
addressing crucial water challenges around 
Australian cities. Aimed at policymakers and 
industry and community leaders, it stimu- 
lated research and partnerships. 

Despite our rewarding experience, inter- 
disciplinary research is still on the margins. 
We urge researchers, institutions, and funding 
bodies committed to sustainable development
to make it mainstream (see ‘Ways to
promote interdisciplinary research’).


Rebekah R. Brown, Ana Deletic and 
Tony H. F. Wong are at Monash University 
in Melbourne, Australia, and in the 
Cooperative Research Centre for Water 
Sensitive Cities. R.R.B. is also director of the 
Monash Sustainability Institute. 

e-mail: rebekah.brown@monash.edu 


1. United Nations. Transforming our World: The 2030 Agenda for Sustainable Development (UN, 2015).

2. Rittel, H. W. J. & Webber, M. M. Policy Sci. 4, 155-169 (1973).

3. APSC. Tackling Wicked Problems: A Public Policy Perspective (Australian Government, 2007).

4. Deletic, A., Fletcher, T. D., Brown, R. R., Hatt, B. E. & Wong, T. H. F. Water 35, 64-72 (2008).

5. Wong, T. H. F. et al. blueprint2013: Stormwater Management in a Water Sensitive City (Cooperative Research Centre for Water Sensitive Cities, 2013).

6. Brown, R. R., Farrelly, M. A. & Loorbach, D. A. Glob. Environ. Change 23, 701-718 (2013).

7. Wong, T. H. F. & Brown, R. R. Water Sci. Technol. 60, 673-682 (2009).

8. Brown, R. R., Keath, N. & Wong, T. H. F. Water Sci. Technol. 59, 847-855 (2009).

9. ICSU. Earth System Science for Global Sustainability: The Grand Challenges (International Council for Science, 2010).

10. Hansen, M. & von Oetinger, B. Harvard Bus. Rev. 79, 106-116 (2001).




| COMMENT | BOOKS & ARTS 


INTERDISCIPLINARITY 


Inside Manchester’s ‘arts lab’ 


Peter E. Pormann on the revelations a meshing of technology and humanities can yield. 


Digital pioneer Steve Jobs delivered a
potent commencement address at
Stanford University, California, in
2005. He described how, as an undergradu- 
ate, he had studied calligraphy rather than 
his prescribed curriculum (he later dropped 
out). Calligraphy may have seemed at the 
time to have no practical application, but a 
decade later, when Jobs was working on the 
Mac, it enabled him to promote proportional 
fonts and establish Apple as the gold standard 
in desktop publishing. Jobs fruitfully com- 
bined the “liberal arts and technology” — a 
phrase he used repeatedly in his last keynote 
addresses before his death in 2011. 

Productive interaction between the arts 
and sciences is at the heart of the John 
Rylands Research Institute at the University 
of Manchester, UK. Founded in April 2013, 
the institute (which I direct with associ- 
ate director and head of special collections 
Rachel Beckett) now has a staff of more 
than two dozen. It brings together scientists, 
conservators, curators, digital-imaging spe- 
cialists and humanities scholars to unravel, 
reveal and realize the research potential of 
the University of Manchester Library’s spe- 
cial collections. These run from clay tablets 
to e-mail archives. Highlights include Greek, 
Coptic and Arabic papyri, medieval Hebrew 
and Persian manuscripts and early-modern 
printed books — such as one of the world’s 
finest collections of volumes printed by 
Renaissance humanist Aldus Manutius. The 
institute was established in response to the 
rise of digital humanities, a field that enables 
the study of books and manuscripts in ways 
that were unimaginable a generation ago. 

There have been triumphs and tribula- 
tions. We have raised more than £3 million 
(US$4.6 million) in funding from sources 
such as the British Academy and biomedi- 
cal-research charity the Wellcome Trust. The 
institute sits in the already-crowded John 
Rylands Library, where its rapid growth is a 
challenge. But our ‘arts lab’ is taking research 
into uncharted territories by shattering 
disciplinary and institutional divisions. 

INTERDISCIPLINARITY: A Nature special issue (nature.com/inter)

[Figure: Erased text in the Syriac Galen Palimpsest is made visible by multispectral-image analysis.]

To make complex collaborations work, we
instigated a buddy system. All researchers
(PhD students, postdocs, visiting academics
and colleagues with funding for pilot
studies) are allocated a curator with intimate
knowledge of the materials they study.
Art-history postdoc Elizabeth Savage, for 
instance, won a three-year early-career 
fellowship from the British Academy to 
study thousands of fifteenth- and sixteenth- 
century prints collected by Hiero von Hol- 
torp, a nineteenth-century scholar of early 
printing technology and aesthetics. Her 
buddy is visual-collections manager Stella 
Halkyard, who helped to rediscover this 
remarkable legacy. Savage also works with 
colleagues at the library’s Centre for Heritage
Imaging and Collection Care (CHICC),
who pioneer innovations in colour print 
photography, such as lighting techniques 
for imaging gold. Combined with close-ups 
of pigments, these techniques have helped 
Savage to identify some of the earliest exam- 
ples of printed gold ink. 

Work at the CHICC is also revolutionizing
understanding of papyri and palimpsests,
manuscripts from which text has been erased
to allow reuse of the page. Researchers have
made detailed images of artefacts using
cutting-edge technology: a 60-million-pixel
digital sensor, combined with a MegaVision
EV LED illumination system. This combines
high-resolution photography with
multispectral imaging, which captures data
at a series of wavelengths across the
electromagnetic spectrum. It can reveal
once-unreadable texts, because different inks
reflect light differently at different
wavelengths. Thus papyrologist Roberta Mazza
has discovered the ‘Last Supper amulet’, a
papyrus with biblical passages on one side
and a grain-tax receipt on the other. Mazza
traced its provenance to near ancient
Hermoupolis in Egypt, close to
modern Al Ashmunayn.

MIKE TOTH/SIAM BHAYRO/DOUG EMERY/DIGITALGALEN.NET/CC BY 3.0

We are also collaborating with scientists 
including Mark Dickinson, a physicist and 
medical-imaging specialist at Manchester's 
Photon Science Institute. Medical imaging is 
rich in techniques that can be used to analyse 
artefacts, such as optical coherence tomogra- 
phy, which is usually harnessed for imaging 
tissue or visualizing blood flow. Dickinson 
has tested it on carbonized papyri too deli- 
cate to unroll, revealing hidden text. 

Also key to investigating the collections 
is image analysis. We are using statistical 
techniques such as canonical variate analysis 
(CVA), which compares group structures 
in multivariate data, to read erased text on 
palimpsests. CVA is applied to a multispectral 
image and an algorithm is trained to recog- 
nize overlying text, the erased underlying text 
and areas where the two coincide. This effec- 
tively maximizes the contrast, so the under- 
text ‘pops’ out and becomes more readable. 
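
The article does not describe the institute's actual software, so the following is only a minimal sketch of the general idea behind canonical variate analysis. It assumes that a scholar has hand-labelled a sample of pixels in a multispectral image as overtext, undertext or overlap, and that the image is held as a NumPy array with one reflectance value per band. The function names `canonical_variates` and `project_cube` are hypothetical and chosen for illustration.

```python
import numpy as np


def canonical_variates(spectra, labels, n_components=2):
    """Find band weightings that best separate the labelled pixel groups.

    spectra: (n_pixels, n_bands) reflectance values of the training pixels.
    labels:  (n_pixels,) group label per pixel, e.g. overtext, undertext, overlap.
    Returns an (n_bands, n_components) matrix of canonical variates.
    """
    spectra = np.asarray(spectra, dtype=float)
    labels = np.asarray(labels)
    n_bands = spectra.shape[1]
    grand_mean = spectra.mean(axis=0)

    within = np.zeros((n_bands, n_bands))   # scatter inside each group
    between = np.zeros((n_bands, n_bands))  # scatter of the group means
    for group in np.unique(labels):
        rows = spectra[labels == group]
        centred = rows - rows.mean(axis=0)
        within += centred.T @ centred
        offset = (rows.mean(axis=0) - grand_mean)[:, None]
        between += len(rows) * (offset @ offset.T)

    # Tiny ridge term (an assumption of this sketch) keeps `within` invertible.
    within += 1e-9 * np.trace(within) / n_bands * np.eye(n_bands)

    # Solve between @ v = lam * within @ v by whitening with a Cholesky factor.
    inv_chol = np.linalg.inv(np.linalg.cholesky(within))
    eigvals, eigvecs = np.linalg.eigh(inv_chol @ between @ inv_chol.T)
    order = np.argsort(eigvals)[::-1][:n_components]
    return inv_chol.T @ eigvecs[:, order]


def project_cube(cube, variates):
    """Apply the variates to every pixel of a (rows, cols, n_bands) image cube."""
    rows, cols, n_bands = cube.shape
    return (cube.reshape(-1, n_bands) @ variates).reshape(rows, cols, -1)
```

Each output channel of `project_cube` is a weighted mix of the original bands in which the labelled groups differ most relative to their internal variation. Displayed as a greyscale image, the first channel is where faint undertext would be expected to stand out most clearly from the overlying script.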

A £1-million image-analysis project that
grew partly out of a collaboration with the
CHICC and has received funding from the
UK Arts and Humanities Research Council
is studying the Syriac Galen Palimpsest. This
is an eleventh-century liturgical work that
carries an erased sixth-century undertext:
a Syriac translation of On Simple Drugs
by the classical physician Galen (around
AD 129-216). We already had a large data
set of multispectral images; now images of
the same page are being combined to
make the undertext more legible (see picture).
Overseeing this is computational
primatologist Bill Sellers, who ordinarily
uses computer modelling to reconstruct the
movements and evolution of extinct species.

All of this work generates large sets of 
images, stored as TIFF files. These raise the 
question of how to store and analyse big data. 
A challenge will be establishing integrated 
systems to allow comparative research across 
platforms. For Greek papyri and Hebrew and 
Persian manuscripts, we plan to develop solu- 
tions with the Cambridge Digital Library; this 
will feed into the iLibrary strategy to bring 
our digital collections and projects under 
one roof. We can also look at large amounts 
of texts and metadata with the tools of com- 
putational corpus linguistics — which studies 
language through samples of real text — and 
text mining, which hunts through text to 
extract data. One such tool is the language- 
processing software system U-Compare. 




Some of our collections are born digital 
— for example, we hold the e-mail archives 
of local literary publishing house Carcanet 
— and future researchers will undoubtedly 
approach these differently from how they 
look at hand-written correspondence. We 
have begun to collaborate with computa- 
tional linguists at Manchester’s National 
Centre for Text Mining, as well as colleagues 
at the nearby Centre for Translation and 
Intercultural Studies, who have vast experi- 
ence with large sets of multilingual texts. And 
with palaeography — the study of ancient 
handwritings, their dating and their classi- 
fication — artificial intelligence might offer 
research avenues that the institute is keen to 
explore. By training software to recognize 
certain hands and writing styles, one might 
be able to query vast virtual collections of 
manuscripts in unprecedented ways. 

Delivering the institute’s inaugural lecture, 
historian Ann Blair of Harvard University in 
Cambridge, Massachusetts, said: “In embrac- 
ing new media, we must never discard the 
old ones.” The interdisciplinary nature of 
the institute is its signature, the tie that binds 
ancient artefacts to state-of-the-art science. 
These form a dual legacy for future generations,
who will want to ask different questions
of the library’s remarkable holdings.


Peter E. Pormann is founding director of 
the John Rylands Research Institute at the 
University of Manchester, UK, and principal 
investigator on the Syriac Galen Palimpsest 
project. 

e-mail: peter.pormann@manchester.ac.uk 


ANTHROPOLOGY 


One-man multidisciplinarian 


Clare Pettitt reassesses the legacy of Victorian polymath Richard Francis Burton. 


Richard Francis Burton (1821-90)
[…] for and mastered knowledge
in so many fields — from geography 
to sexology — that his real legacy for science 
is muddied. The flamboyant polymath was 
an eminent explorer, a pioneer of ethnog- 
raphy and a linguist fluent in more than 
25 languages (from Arabic to Swahili) and 
a number of dialects. He wrote or trans- 
lated more than 40 volumes, including The 
Lake Regions of Central Africa, published 
155 years ago, and the first English edition 
of The Arabian Nights (1885). He was also 
an enthusiastic amateur of botany, geology 
and zoology, even running an experiment 
on monkey communication while living 
in Sindh (now Pakistan). Overall, this furi- 
ously energetic multidisciplinarian both 


contributed vastly to knowledge of other 
cultures and continents, and sometimes 
misread them to his — and their — cost. 
These complex interests were the fruit 
of a turbulent mind. The eldest son of an 
army family, Burton had a protean character 
shaped on the road as his parents moved their 
young family restlessly around France and 
Italy. He started to learn Latin at three years 
old and Greek at four, and quickly picked 
up French, Italian and local dialects. At the 
University of Oxford, UK, contemptuous of 
the teaching methods, he honed his mastery 
of languages but was expelled for attending 
a steeplechase. He was soon propelled into 
the Bombay Infantry and immersed himself 
in Indian languages and culture. Violent and 
mesmerizing by turns, he was viewed as both 


prodigiously gifted and morally suspect by 
his contemporaries — as an ‘other; just as he 
himself was possessed by otherness. 

By 1853, Burton had turned to explora- 
tion. Still beset by inner conflicts, he could 
also attract conflict with others. His great 
1856-59 expedition to East Africa with John 
Hanning Speke, instigated by the Royal Geo- 
graphical Society in London, was a case in 
point. It made “formidable contributions to
imperial knowledge production”, according
to historian Adrian Wisnicki. Although both 
men were seriously disabled by disease, Bur- 
ton became the first European to see Lake 
Tanganyika. He kept dense geographical and 
cultural notes and meteorological records, 
and collected specimens for what are now 
the Royal Botanic Gardens, Kew, and the
Royal School of Mines in London. But the
expedition led to a bitter rivalry between the 
two over the source of the Nile, with Speke 
claiming it as the lake that he dubbed Lake 
Victoria, and Burton feeling that the evidence 
failed to add up. Long after their return, 
in 1864, the British Association for the 
Advancement of Science called for a debate 
in London, but Speke died of an unexplained 
gunshot wound the day before. “The charita- 
ble say that he shot himself, the uncharitable 
say that I shot him,” Burton wrote to a friend. 

Burton was shocked, but published The
Nile Basin that year, reiterating his position
in the Nile controversy first detailed
in The Lake Regions of Central Africa.
Burton felt that Speke’s account,
Journal of the Discovery of the Source
of the Nile (1863), had dressed Africa
up in flowery, fundamentally unscientific
rhetoric, claiming for instance that a mass 
of dirty huts (in Burton’s words) was a vil- 
lage built on the most luxurious principles. 
Burton insisted on using indigenous names 
and learnt local languages so that he could 
communicate directly with people he met — 
and his investigations would prove invaluable 
to future explorers. “I undertook the history 
and the ethnography, the languages, and the 
peculiarities of the people,” he is quoted as
saying, adding scornfully that to Speke “fell 
the arduous task of delineating an exact 
topography”. Geography, Burton established, 
was a social as well as a physical science. The 
explorer Henry Morton Stanley would prove 
in 1875 that Speke had correctly identified 
the source of the Nile, but he used Burton's 
notes to get there. As Burton put it in Zan- 
zibar; City, Island, and Coast (1872), future 
expeditions “had only to tread in my steps”. 
Throughout a life of trailblazing travel and
diplomacy (from Somaliland to Benin, Arabia,
the Middle East, Asia and the Americas),
Burton’s first epistemological framework
for colonial encounters was the ‘Orientalist’
one of linguistic scholarship. But as an
ethnographer, he was original. He mingled
with the people whose cultures he studied,
understanding that knowledge is embodied
and must be historically contextualized.
This was criticized in Victorian England,
with its horror of ‘going native’, but places
him ahead of his time. Burton was always
quick to acknowledge the contingencies
and accidents that brought him into contact
with local people, and never tried to efface
himself from his narrative. Only in the late
twentieth century did anthropologists such
as John and Jean Comaroff suggest that the
obvious weaknesses of ethnography as a
‘science’ are also its strengths, as “participant
observation ... connotes the inseparability of
knowledge from its knower”. Studies from the
1970s onwards supported this view, including
Annette Weiner’s The Trobrianders of Papua
New Guinea (Holt, Rinehart and Winston,
1988), a reappraisal of Bronislaw Malinowski’s
study of the Pacific Trobriand Islands,
Argonauts of the Western Pacific (Routledge and
Kegan Paul, 1922).

[Figure: Ethnographic pioneer and explorer Richard Francis Burton, photographed around 1860.]

In other ways, and much less attractively, 
Burton was very much of his time. His respect 
for Muslim culture did not preclude his suc- 
cumbing temporarily to a vicious racism that 
became particularly extreme in the 1860s and 
cannot be exonerated. By the mid-1860s he 
had become one of Britain’s foremost prom- 
ulgators of the polygenist thesis that Africans 
constituted a distinct and inferior species, and 
he helped to found the Anthropological Soci- 
ety of London, established after a dispute with 
the monogenist Ethnological Society. By his 
last decade, Burton had come to his senses,
embracing the view that all of civilization
came from Africa, and felt that “negroes ...
have shown themselves fully equal in intellect
and capacity to the white races of Europe
and America”. But the damage had been done.

Despite this sorry chapter, Burton’s 
immersion in a multitude of languages and 
cultures gave him a unique perspective on 
humanity, with “the enormous advantage 
of being capable of comparing native with 
foreign ideas and views of the world”. He 
knew that other cultures could never be 
fully ‘translated’ or subsumed into English, 
and that this militated against the ethos of 
Empire. He was perhaps less Orientalist than 
comparativist and relativist. His contribu- 
tion to the fledgling social sciences was all 
the more powerful, perhaps, for having been 
fed by so many streams of knowledge, even if 
this makes it less visible to us today.


Clare Pettitt is professor of nineteenth- 
century literature and culture at King’s College
London. She is the author of Dr Livingstone, I 
Presume? and many articles about exploration 
and travel in Victorian print culture. 

e-mail: clare.pettitt@kcl.ac.uk 


HULTON-DEUTSCH COLLECTION/CORBIS 


Correspondence 


New environment 
law shows its fangs 


China's revised Environmental 
Protection Law went into effect 
on 1 January this year. Severe 
punishments for polluting 
businesses swiftly followed. 

Some 292 cases incurred 
an accumulating daily fine 
within the first 6 months, 
totalling 236 million yuan 
(US$37 million). The highest 
single levy was 15.8 million 
yuan (data from the Ministry of 
Environmental Protection; see 
www.mep.gov.cn). Over the same 
period, production was curtailed 
in 1,092 cases and equipment was 
locked down in 1,814 instances. 
Criminal charges were brought 
against 740 polluting businesses, 
and 782 were punished with 
police administrative detention. 

Local governments are 
cooperating with the new law, 
contrary to earlier misgivings (see 
B. Zhang and C. Cao Nature 517, 
433-434; 2015 and H. Yang et al. 
Science 347, 834-835; 2015). In 
Linyi in Shandong province, for 
example, several dozen businesses 
(including some responsible for 
high employment and large tax 
revenues) have been closed down. 
Dasheng Liu Shandong Institute 
of Environmental Science, Jinan, 
China. 
liu_sdiep@126.com 


Tailor checklists to 
clinical teams 


The problems of replicating the 
effects of patient-safety checklist 
trials in routine practice could 
be mitigated by adapting 
checklists for individual hospital 
environments and teams (see 
Nature 523, 516-518; 2015). An 
F-16 fighter aircraft would not 
rely on a checklist devised for 
flying a jumbo jet. 

For instance, much of the 
World Health Organization's 
surgical safety checklist 
is irrelevant to a cardiac 
catheterization procedure. There 
is no general anaesthetic or 
expected blood loss, for example, 


but monitoring kidney function 
is crucial. We therefore designed 
a bespoke safety checklist to brief 
the cardiac clinical team on the 
planned procedure and on any 
potential problems. Endorsed
by the British Cardiovascular
Society (www.bcs.com/ 
checklist), the checklist is 
regularly modified in response to 
end-user evaluation. 

Smart electronic checklists 
will further improve safety by 
highlighting patient-specific 
risks and acting as a guide in 
emergencies and for auditing 
near-misses. 

Thomas J. Cahill Oxford 
University Hospitals NHS Trust, 
Oxford, UK. 

Rod Stables Liverpool Heart and 
Chest Hospital, Liverpool, UK. 
thomas.cahill@cardiov.ox.ac.uk 


Mining shell waste 
will not be easy 


If the chemical industry is
to profit from refining waste
crustacean shells and other
by-products of seafood
processing, collection problems
and food-safety issues need
to be overcome (see N. Yan
and X. Chen Nature 524,
155-157; 2015).

Gathering sufficient animal 
feedstock for commercial 
purposes will be a formidable 
challenge (R. L. Naylor et al. Proc. 
Natl Acad. Sci. USA 106, 15103- 
15110; 2009). The transport and 
storage of seafood by-products 
from different processing plants is 
also likely to be extremely costly. 

Moreover, expensive energy- 
intensive drying of crustacean 
shells would be necessary to 
prevent microbial growth and 
production of carcinogenic 
aflatoxins. Other
health risks could arise from 
bioaccumulation of contaminants 
(such as heavy metals in shells) or 
from cross-species transmission 
of pathogens and perhaps even 
of prions through the food 
chain (L. Cao et al. Science 347, 
133-135; 2015). 

Hong-Wei Xiao, Zhen-Jiang Gao 


China Agricultural University, 
Beijing, China. 

A.S. Mujumdar McGill 
University, Quebec, Canada. 
xhwcaugxy@163.com 


Seal of approval for 
ocean observations 


We announce that the Pacific 
Islands Ocean Observing System 
was certified last month as the 
first regional partner to attain full 
membership of the US Integrated 
Ocean Observing System (IOOS). 
This certification is a hallmark of 
the quality of data provided by the 
IOOS, to the benefit of the public, 
the private sector and individuals. 

It is also an indicator to the 
global community that IOOS 
regional partners providing data 
from the oceans, Great Lakes and 
coasts of North America have 
met rigorous criteria for system 
oversight, information security, 
public engagement and financial 
controls. 

The IOOS includes federal 
and non-federal partners in an 
interagency investment by the 
US government of more than 
US$2 billion annually for the 
collection and provision of ocean 
data and for improved forecast 
capabilities. It comprises about 
10,000 unique oceanographic 
data sets and some 4,000 services 
that provide data, metadata 
and refined data products to 
tens of millions of US users. For 
instance, IOOS data are used in 
search-and-rescue operations 
and to ensure safe operation of 
commercial vessels. 

Certified IOOS data can 
be entered in the permanent 
US archive at the National 
Centers for Environmental 
Information and can be used 
internationally by the Global 
Telecommunication System for 
meteorological data. 

Chris E. Ostrander University 
of Hawaii at Manoa, Honolulu, 
Hawaii, USA. 

Conrad C. Lautenbacher 
GeoOptics, Dunwoody, Georgia, 
USA. 

chriso@hawaii.edu 


Lack of help stymied 
community care 


John Foot’s book on psychiatrist
Franco Basaglia’s movement to
reform Italy’s psychiatric hospitals 
ends with the passing of Law 180 
in 1978 to close down asylums 
(see A. Tone Nature 524, 290; 
2015). Sadly, the law was poorly 
implemented owing to woefully 
inadequate resources. 

Families received little or no 
support in caring for those who 
returned home. For some it 
was too much, forcing general 
hospitals to take up the slack. 
Psychiatrists found their hands 
tied when confronted with people 
who were seriously mentally 
ill, so many ended up in prison 
stigmatized as criminals. 

Even Basaglia’s widow, Franca 
Ongaro Basaglia — a core 
member of the reform movement 
and later an Italian senator — 
described Law 180 as a failure. 
Laura Spinney Paris, France. 
Ifspinney@gmail.com 


Education reforms 
ring true 50 years on 


Stephen Bradforth and colleagues’ 
discussion of what is needed 
to develop “a science-literate 
population” (Nature 523, 282-284; 
2015) echoes the words of a Nature 
editorial 50 years ago, entitled 
‘New thinking in undergraduate 
teaching’ (Nature 205, 835; 1965). 
According to the editorial, 
“the student is in danger of 
spending too much of his [sic] 
limited time memorizing facts, 
and has insufficient time at his 
disposal to master the principles 
underlying his subject and to 
develop his powers of thought”. It
continues: “the most important
purpose of a university education
is to teach the student to think 
for himself... it may on occasion 
demand a re-examination of the 
whole approach to a subject in 
undergraduate courses.” Indeed. 
Barry S. Winkler Eye Research 
Institute, Oakland University, 
Rochester, Michigan, USA. 
winkler@oakland.edu 


17 SEPTEMBER 2015 | VOL 525 | NATURE | 321 
© 2015 Macmillan Publishers Limited. All rights reserved 


BRIEF COMMUNICATIONS ARISING 


Decompensated cirrhosis and microbiome 


interpretation 


ARISING FROM N. Qin et al. Nature 513, 59-64 (2014); doi:10.1038/nature13568


The diagnosis of cirrhosis, especially in the advanced/decompensated
stages, is made using simple and inexpensive clinico-radiologic-
pathological techniques¹. Qin et al.², whose paper has replicated
prior studies³⁻⁵, reported a relatively novel profile to diagnose
cirrhosis using complex stool metagenomics despite having a majority (65%
discovery and 76% validation cohorts) decompensated cirrhotic 
population. We have found that the decompensated cirrhosis cohort, 
which does not require these complicated diagnostic strategies, was 
responsible for a significant proportion of these microbiota changes 
on further analysis of their metagenomics data and using a new cohort 
of 360 subjects. Therefore, given several confounders and the ease of 
decompensated cirrhosis diagnosis using current techniques, a careful 
re-interpretation of newer microbiota-based diagnostic strategies 
that do not a priori differentiate between early (compensated) and 
decompensated cirrhosis and treat all people with cirrhosis as one 
uniform population should be performed. There is a Reply to this 
Brief Communication Arising by Qin, N. et al. Nature 525, http:// 
dx.doi.org/10.1038/nature14852 (2015). 

A major confounder in people with cirrhosis is the use of standard-of-care
therapies such as lactulose, rifaximin, antibiotics and acid suppressants
that can affect the gut milieu’®. These alone could explain a large portion 
of the metagenomics changes and have not been accounted for*’”. 
These medications, especially proton pump inhibitors, could also be a 
major reason why oral origin bacteria are found in the intestine, as has 
been shown in prospective cirrhotic and non-cirrhotic studies’*”. 

We hypothesized that there was a significant difference in compen-
sated versus decompensated cirrhotic microbiota in Qin et al.², which
needs to be accounted for in the interpretation. Using 66 enriched/
depleted metagenomic sequences (MGS) provided by S. D. Ehrlich,
we performed linear discriminant analysis (LDA) effect size (LEfSe)¹²
after classifying them into healthy, compensated and decompensated
subjects. LEfSe uses a factorial Kruskal-Wallis and LDA test to
detect features with significant differential abundance. We found that
even in the selected data set the authors provided, 17 of 66 MGS were
different between compensated and decompensated groups (10 MGS
overexpressed and 7 MGS underexpressed, Fig. 1a). These included
several oral origin species (Streptococcus oralis and several Veillonella
spp.), which were the primary study results. We then enrolled 360 age-
matched subjects (45 healthy individuals (age 54 ± 3 years, no
chronic diseases), 171 compensated (age 54 ± 4 years, median
Child-Pugh score 6) and 141 decompensated cirrhotic patients (age
55 ± 2 years, median Child-Pugh score 9)) for stool multi-tagged
pyrosequencing (MTPS)¹³. Using Kruskal-Wallis analysis of relative
microbial family abundance >1%, we found that compensated 
and decompensated patients were significantly different (Fig. 1b). 
Proteobacteria levels, specifically Enterobacteriaceae, were signifi- 
cantly higher in decompensated cirrhotic patients. This pattern is also 
seen in other recent MTPS studies*"*. Although MGS and MTPS are 
not completely comparable, it is interesting that both resulted in 
similar conclusions. Therefore, there are significant microbiota dif- 
ferences between compensated and decompensated patients that need 
to be separated in cirrhosis microbial studies. 
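
The three-group comparison described above can be illustrated with a minimal sketch of the Kruskal-Wallis rank test. It is written in plain Python under stated assumptions: the abundance values are invented for illustration and are not data from either study, the function names are hypothetical, and the P value uses the chi-square approximation, which for exactly three groups (two degrees of freedom) reduces to exp(-H/2). It does not reproduce LEfSe, which adds an LDA step, nor any multiple-testing correction.

```python
import math
from collections import Counter


def average_ranks(values):
    """Return 1-based ranks for `values`; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def kruskal_wallis_three_groups(a, b, c):
    """Return (H, p) for three independent samples.

    H is tie-corrected. p is the chi-square approximation with 2 degrees
    of freedom, whose survival function is exactly exp(-H / 2).
    """
    groups = [list(a), list(b), list(c)]
    if any(len(g) == 0 for g in groups):
        raise ValueError("each group needs at least one observation")
    pooled = [x for g in groups for x in g]
    n = len(pooled)
    ranks = average_ranks(pooled)

    total = 0.0
    start = 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        total += rank_sum ** 2 / len(g)
        start += len(g)
    h = 12.0 / (n * (n + 1)) * total - 3.0 * (n + 1)

    # Tie correction: divide H by 1 - sum(t^3 - t) / (n^3 - n).
    tie_counts = Counter(pooled)
    correction = 1.0 - sum(t ** 3 - t for t in tie_counts.values()) / (n ** 3 - n)
    if correction == 0:
        raise ValueError("all observations are identical")
    h /= correction
    return h, math.exp(-h / 2.0)


if __name__ == "__main__":
    # Hypothetical relative abundances of one bacterial family (illustration only).
    healthy = [0.02, 0.03, 0.01, 0.04]
    compensated = [0.05, 0.06, 0.03, 0.07]
    decompensated = [0.12, 0.15, 0.09, 0.20]
    h_stat, p_value = kruskal_wallis_three_groups(healthy, compensated, decompensated)
    print(f"H = {h_stat:.3f}, p = {p_value:.4f}")
```

In a study with many taxa, one such test would be run per taxon and the resulting P values adjusted for multiple comparisons, as the caption to Figure 1b indicates.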

In addition, in Qin et al.² the calculation of the model for end-stage
liver disease (MELD) score in Supplementary Table 1 is inaccurate,
casting doubt on figure 2. The authors compared diabetes patients
with cirrhotic patients to inform their cirrhosis-associated profile.
However, diabetes is prevalent and is associated with a poor prognosis
in cirrhosis¹⁵. Therefore these results are not generalizable to patients
with cirrhosis and diabetes.

The present need is not for complicated profiles that are unlikely to 
supplant currently available simple diagnostic strategies, but rather 
for improving prognostication. This is because gut microbiota are 
associated with several cirrhosis-related pre-terminal events such as 
hepatic encephalopathy and infections’. A prior study has shown that 
altered stool microbiota can predict poor outcomes, but further work 
is required’. 


[Figure 1a: LEfSe bar plot for compensated versus decompensated cirrhosis; x axis, log10 (LDA score). Taxa shown: Haemophilus parainfluenzae, Veillonella dispar, Veillonella parvula, Veillonella sp., Veillonella atypica, Fusobacterium nucleatum, Campylobacter sp., Aggregatibacter segnis, Streptococcus vestibularis, Streptococcus oralis, Alistipes indistinctus, Alistipes, Clostridium symbiosum, Coprococcus comes, Bilophila wadsworthia, Oscillibacter and an unknown species.]
[Figure 1b: box plots by bacterial family; y axis, abundance.]


Figure 1 | Microbiota distribution between compensated and 
decompensated cirrhotic subjects. a, LEfSe plot showing metagenomic species
that are overexpressed (green) and under-expressed (red) in decompensated
compared to compensated cirrhosis from Qin et al.². b, In the new data set
using MTPS, boxplots showing interquartile range of median abundance of
statistically significant comparisons between controls (orange), compensated
cirrhosis (green) and decompensated cirrhosis (blue) using multiple
corrections-adjusted Kruskal-Wallis tests at the family level. The line in
the centre shows median.




Therefore, the careful separation of the two groups within cirrhosis, 
which have different diagnostic criteria and prognoses, and the control 
of confounders owing to drugs mentioned above, are important for the 
correct interpretation of these results and to avoid epiphenomena. 


Jasmohan S. Bajaj¹, Naga S. Betrapally² & Patrick M. Gillevet²
¹Division of Gastroenterology, Hepatology and Nutrition, Virginia
Commonwealth University and McGuire VA Medical Center, Richmond,
Virginia 23249, USA.
email: jsbajaj@vcu.edu
²Microbiome Analysis Center, George Mason University, Manassas,
Virginia 20110, USA.


Received 12 October 2014; accepted 8 June 2015. 


1. Schuppan, D. & Afdhal, N. H. Liver cirrhosis. Lancet 371, 838-851 (2008).
2. Qin, N. et al. Alterations of the human gut microbiome in liver cirrhosis. Nature 513, 59-64 (2014).
3. Bajaj, J. S. et al. Linkage of gut microbiome with cognition in hepatic encephalopathy. Am. J. Physiol. Gastrointest. Liver Physiol. 302, G168-G175 (2012).
4. Chen, Y. et al. Characterization of fecal microbial communities in patients with liver cirrhosis. Hepatology 54, 562-572 (2011).
5. Bajaj, J. S. et al. Colonic mucosal microbiome differs from stool microbiome in cirrhosis and hepatic encephalopathy and is linked to cognition and inflammation. Am. J. Physiol. Gastrointest. Liver Physiol. 303, G675-G685 (2012).
6. Chavez-Tapia, N. C., Tellez-Avila, F. I., Garcia-Leiva, J. & Valdovinos, M. A. Use and overuse of proton pump inhibitors in cirrhotic patients. Med. Sci. Monit. 14, CR468-CR472 (2008).
7. Bajaj, J. S. et al. A longitudinal systems biology analysis of lactulose withdrawal in hepatic encephalopathy. Metab. Brain Dis. 27, 205-215 (2012).
8. Bajaj, J. S. et al. Altered profile of human gut microbiome is associated with cirrhosis and its complications. J. Hepatol. 60, 940-947 (2014).
9. Bajaj, J. S. et al. Modulation of the metabiome by rifaximin in patients with cirrhosis and minimal hepatic encephalopathy. PLoS ONE 8, e60042 (2013).
10. Kanno, T. et al. Gastric acid reduction leads to an alteration in lower intestinal microflora. Biochem. Biophys. Res. Commun. 381, 666-670 (2009).
11. Bajaj, J. S. et al. Systems biology analysis of omeprazole therapy in cirrhosis demonstrates significant shifts in gut microbiota composition and function. Am. J. Physiol. Gastrointest. Liver Physiol. 307, G951-G957 (2014).
12. Segata, N. et al. Metagenomic biomarker discovery and explanation. Genome Biol. 12, R60 (2011).
13. Gillevet, P., Sikaroodi, M., Keshavarzian, A. & Mutlu, E. A. Quantitative assessment of the human gut microbiome using multitag pyrosequencing. Chem. Biodivers. 7, 1065-1075 (2010).
14. Zhang, Z. et al. Large-scale survey of gut microbiota associated with MHE via 16S rRNA-based pyrosequencing. Am. J. Gastroenterol. 108, 1601-1611 (2013).
15. Elkrief, L. et al. Diabetes mellitus is an independent prognostic factor for major liver-related outcomes in patients with cirrhosis and chronic hepatitis C. Hepatology 60, 823-831 (2014).


Author Contributions J.S.B. supervised the patient recruitment, sample collection
and clinical analysis. He was also involved in the data interpretation, analysis and
drafting of the manuscript. N.S.B. was involved in data analysis and interpretation.
P.M.G. was responsible for data analysis, interpretation and drafting of the manuscript.
All authors participated in the critical revision of the manuscript.


Competing Financial Interests Declared none.


doi:10.1038/nature14851


Qin et al. reply


REPLYING TO J. S. Bajaj, N. S. Betrapally & P. M. Gillevet Nature 525, http://dx.doi.org/10.1038/nature14851 (2015) 


In the accompanying Comment¹, a concern expressed by Bajaj et al. is
that diagnostics of liver cirrhosis by microbiome analysis that we
report² may be mainly due to the microbiome alterations in decom-
pensated patients (DP). To address it we tested how accurately com-
pensated patients (CP) can be diagnosed by microbiome analysis.
Two slightly different criteria of identifying these were used, based
on absence of ascites and hepatic encephalopathy (n = 54) and
absence of ascites only (n = 57).

First, we constructed a discriminator of patients (P, n = 98) and
healthy controls (H, n = 83) in the discovery cohort, disregarding
the patient status (CP or DP). For that we used as input the presence
and abundance of 66 metagenomic species (MGS) differentially
represented in the two groups² and as output area under curve
(AUC) of a receiver operator characteristic (ROC) analysis, essentially
as described previously**. The optimal discriminator required 7 MGS 
only and yielded an AUC of 0.95 for the discovery cohort and
of 0.94 for the validation cohort (P n = 25; H n = 31), values
somewhat higher than those observed for the discriminator based
on 15 biomarkers². The discriminator stratified the CP (n = 54 or
n = 57) from H (n = 114) as accurately as the DP (n = 69 or
n = 66), with an AUC of 0.95 for all. This shows that the gut micro-
biome alterations in the two types of patients have highly similar
features. These features are not greatly affected by medication,
another concern expressed by Bajaj et al.¹, as the discriminator strati-
fied with a comparable efficiency H (n = 114) from P that were taking
antiviral medication (n = 52) or not (n = 71) with an AUC of 0.95 for
both; taking β-blockers (n = 11) or not (n = 112), with an AUC of
0.95 and 0.96, respectively; or taking PPI (n = 70) or not (n = 53),
with an AUC of 0.96 and 0.93, respectively. We suggest that the
inability to construct an efficient discriminator of H and CP by
Bajaj et al.¹ may be due to an inadequate resolution provided by the
broadly used gene encoding the 16S ribosomal RNA, which remains 
generally at the genus level rather than the species one achieved by 
quantitative metagenomics we deploy” *. 
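
The AUC values quoted in this reply summarize how well a score separates two groups. As a minimal sketch of that quantity only (not of the authors' MGS-based discriminator, whose construction is described in their cited work), the AUC can be computed as the probability that a randomly chosen patient scores higher than a randomly chosen control, counting ties as one half. The function name and the scores below are hypothetical.

```python
def roc_auc(patient_scores, control_scores):
    """Area under the ROC curve via pairwise comparison.

    Equals P(patient score > control score) + 0.5 * P(tie), which is the
    Mann-Whitney U statistic divided by the number of patient-control pairs.
    """
    if not patient_scores or not control_scores:
        raise ValueError("both groups need at least one score")
    wins = 0.0
    for p in patient_scores:
        for c in control_scores:
            if p > c:
                wins += 1.0
            elif p == c:
                wins += 0.5
    return wins / (len(patient_scores) * len(control_scores))


if __name__ == "__main__":
    # Hypothetical discriminator outputs (illustration only, not study data).
    patients = [0.91, 0.84, 0.77, 0.45, 0.95]
    controls = [0.12, 0.30, 0.52, 0.08, 0.25]
    print(f"AUC = {roc_auc(patients, controls):.2f}")
```

An AUC of 0.5 corresponds to chance-level separation and 1.0 to perfect separation, which is the scale on which the values of 0.93 to 0.96 above should be read.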

Notwithstanding the similarity of the gut microbiome alterations
in CP and DP, there are also differences between the two groups, as
suggested by the association of the disease severity scores and the load of
the liver cirrhosis-enriched species². Bajaj et al.¹ rightly point out
an inaccuracy of the calculation of the model for end-stage liver disease
(MELD) score in our report², which refers to previous literature; how-
ever, the correction had a modest effect, the statistical significance
between the scores of patients with the lowest and the highest LC quart- 
ile load being P< 2X 10° rather than the reported P< 1X 10°. 

To further explore the microbiome alterations in CP and DP we 
searched for the MGS having a significantly different abundance in 
the two groups, following the approach used for identifying 66 species 
enriched in C or P groups². Some 30 such MGS were found in the
discovery cohort (CP n = 45; DP n = 54), but only 13 were not present 
in the set of 66. All 79 species were used to construct the best discrim- 
inator for the discovery cohort. It was based on 14 MGS and stratified 
the CP and DP of the discovery cohort with an AUC of 0.87 and those of 
the validation cohort (CP n = 9; DP n = 16) with an AUC of 0.84. 

This analysis confirms our finding that the alterations of the gut 
microbiome are associated with the severity of the disease. However, it 
provides no evidence for a saltatory alteration to a different composi-
tion upon decompensation, which could confound microbiome ana-
lysis, as suggested by Bajaj et al.¹; a gradual alteration with the severity
would lead to the same result. 

Diabetes has been excluded in the patient enrolment in our study².
Furthermore, invasion of the gut by oral species was not observed in
previous studies of type 2 diabetes, notwithstanding the use of
quantitative metagenomics, which would have easily revealed them
were they present⁵,⁶. Alterations of the gut microbiome owing to liver
cirrhosis are therefore unlikely to be confounded by diabetes and the
diagnostics of the two pathologies by the gut microbiome analysis
remains a real possibility. Short-term nutritional changes, such as
hospital diet, generally have only a modest effect on gut microbiota;
long-term dietary patterns, which affect it more⁷,⁸, are not very sig-
nificantly different for the cirrhosis patients and healthy controls in
the Chinese population from which the participants enrolled in our
study were drawn².

In conclusion, while we adhere to the call of Bajaj et al.¹ for
caution regarding potential confounders in microbiome analysis, we 
strongly disagree with their suggestion that the alterations we report 
are “epiphenomena” rather than actual differences of gut microbial 
communities associated with liver cirrhosis. We suggest that micro- 
biome analysis might supplant current inadequate clinical diagnostic 
parameters and/or invasive procedures such as liver biopsy for detect- 
ing compensated cirrhosis. 


Nan Qin¹,²*, Emmanuelle Le Chatelier³*, Jing Guo¹, Edi Prifti³,
Lanjuan Li¹,² & S. Dusko Ehrlich³,⁴
¹State Key Laboratory for Diagnosis and Treatment of Infectious Disease,
The First Affiliated Hospital, College of Medicine, Zhejiang University,
310003 Hangzhou, China.
email: ljli@zju.edu.cn
²Collaborative Innovation Center for Diagnosis and Treatment
of Infectious Diseases, Zhejiang University, 310003 Hangzhou,
China.
³Metagenopolis, Institut National de la Recherche Agronomique, 78350
Jouy en Josas, France.
email: dusko.ehrlich@jouy.inra.fr
⁴King’s College London, Centre for Host-Microbiome Interactions,
Dental Institute Central Office, Guy’s Hospital, London Bridge, London
SE1 9RT, UK.
*These authors contributed equally to this work.


1. Bajaj, J. S., Betrapally, N. S. & Gillevet, P. M. Decompensated cirrhosis and microbiome interpretation. Nature 525, http://dx.doi.org/10.1038/nature14851 (2015).
2. Qin, N. et al. Alterations of the human gut microbiome in liver cirrhosis. Nature 513, 59-64 (2014).
3. Le Chatelier, E. et al. Richness of human gut microbiome correlates with metabolic markers. Nature 500, 541-546 (2013).
4. Cotillard, A. et al. Dietary intervention impact on gut microbial gene richness. Nature 500, 585-588 (2013).
5. Qin, J. et al. A metagenome-wide association study of gut microbiota in type 2 diabetes. Nature 490, 55-60 (2012).
6. Karlsson, F. H. et al. Gut metagenome in European women with normal, impaired and diabetic glucose control. Nature 498, 99-103 (2013).
7. Wu, G. D. et al. Linking long-term dietary patterns with gut microbial enterotypes. Science 334, 105-108 (2011).
8. Claesson, M. J. et al. Gut microbiota composition correlates with diet and health in the elderly. Nature 488, 178-184 (2012).


doi:10.1038/nature14852 




NEWS & VIEWS 


For News & Views online, go to 
nature.com/newsandviews 


Forgetfulness illuminated 


Memories are stored in the complex network of neurons in the brain. With the help of innovative tools to manipulate the 
connections between neurons, memories in mice can now be erased with a beam of light. SEE ARTICLE P.333 


JU LU & YI ZUO


More than a century ago, the German
zoologist Richard Semon proposed
that memories leave physical traces
in the brain, and coined the term ‘engram’
to describe such traces¹. Although the con-
cept has gained general recognition, the
search for the engram is ongoing. In this 
regard, the synapse — a specialized connect- 
ing region between neurons — has received 
much attention, but there is still no direct 
evidence of a causal link between synaptic 
changes and memory formation. In this issue, 
Hayashi-Takagi et al.² (page 333) fill this gap.
Using ingenious protein engineering and live 
imaging, the authors identify which synapses 
are activated when a mouse learns a motor 
skill, and then weaken these synapses to erase 
motor memory. 

Most synapses in the brain form between 
axons (neuronal ‘output cables’) and dendrites 
(input cables). Signals to excitatory synapses 
are usually received by micrometre-sized 
protrusions called spines that emanate from 
dendrites. The size of the spine head correlates 
with the strength of the synapse³. Spines may
emerge, disappear or change in size during
learning and memory formation, reflecting
changes in the wiring of neuronal circuits³.

To investigate the causal relationship 
between the formation of motor memo- 
ries and the structural potentiation of spines 
(spine formation or enlargement), Hayashi- 
Takagi et al. developed an ‘optoprobe’ called 
AS-PaRac1 that manipulates potentiated spines
in response to light. The DNA construct for
AS-PaRac1 encodes a light-activatable version
of the small signalling protein Rac1, whose
prolonged activity induces spines to shrink. 
The construct also incorporates the dendrite- 
targeting sequence of the gene Arc, which is 
expressed rapidly and transiently in response 
to neuronal activity, ensuring that the probe 
moves to dendritic spines that are undergoing 
structural potentiation. The AS-PaRac1 opto- 
probe is the first optogenetic tool to enable the 
manipulation of potentiated spines. 

Hayashi-Takagi and colleagues expressed
AS-PaRac1 in the motor cortex of mice and
trained the animals to run on an accelerat-
ing rotating rod known as a rotarod. Light
activation of AS-PaRac1 in potentiated spines
after learning caused the spines to shrink,
disrupting the animals’ ability to run on the
rotarod. This demonstrates the causal rela-
tionship between synaptic strength and motor
memory in this context (Fig. 1).

[Figure 1 schematic labels: rotarod training; spine formation; spine enlargement; blue light.]

Figure 1 | Inducing forgetting. A neuron receives excitatory signals from other neurons through
dendritic spines. When a mouse learns a new task, such as running on an accelerating rotating rod
(a rotarod), spines involved in learning this task become potentiated (new spines form and existing spines
increase in size). Hayashi-Takagi et al.² developed an ‘optogenetic construct’ based on a light-activatable
form of the small signalling protein Rac1, which targets recently potentiated dendritic spines. Blue
light activates the modified Rac1, which induces shrinkage of the spines. The authors found that spine
shrinkage caused the mouse to forget the skill it had learnt, so it soon fell off the rotating rod.

Next, the authors showed that the effect of 
the probe is task-specific. When mice learnt to 
run on the rotarod and then learnt to walk on 
a thin beam, disrupting the spines that were 
potentiated during beam walking did not affect 
performance on the rotarod. Furthermore, 
AS-PaRac1 activation in spines that spontane-
ously potentiated two days after learning (pre-
sumably because of unrelated motor tasks) did
not affect motor performance. Finally, when 
the authors retrained mice on the same task 
for which spine potentiation had been dis- 
rupted, most of the optically shrunken spines 
reverted to their original potentiated sizes. 
Together, these results suggest that distinct 
subsets of synapses are altered in a task-spe- 
cific way during motor learning and memory 
formation. 

In the long quest for the engram, neuro- 
scientists have reached the consensus that the 
mammalian brain stores different memory 
traces in different subsets of neurons in spe- 
cific regions. Methods for labelling, imaging, 
activating and silencing neurons in animals 
have enabled researchers to map the ensemble 


324 | NATURE | VOL 525 | 17 SEPTEMBER 2015 


© 2015 Macmillan Publishers Limited. All rights reserved 


of neurons that correlates with a particular 
learning task, to manipulate their activities, and 
even to generate artificial memory traces*®. 
However, a single neuron may participate in 
the processing and storage of more than one 
distinct piece of information⁷. Therefore, the
engram of a particular memory involves not 
only the identity of the constituent neurons, 
but also the entire set of synaptic connections 
between these neurons. How memory is allo- 
cated at this synaptic level remains unclear. 

To qualify as an engram, a synaptic circuit 
should satisfy several criteria. First, changes in 
synaptic structures and function should corre- 
late with learning. Second, blocking such syn- 
aptic modifications should prevent memory 
formation, demonstrating the need for these 
changes. And third, artificially inducing syn- 
aptic changes should be sufficient to produce 
a memory without the need for behavioural 
training. Over the past decade, in vivo imag-
ing has revealed⁸ that the dynamic formation
and elimination of dendritic spines correlates 
with motor-skill learning and memory. Now, 
Hayashi-Takagi and colleagues have taken 
the next step, by establishing necessity — 
they show that undoing the synaptic changes 
that accompany motor learning does indeed 
disrupt the memory. 

The development of genetic and optical
tools such as AS-PaRac1 promises to enable
dissection of the finer details of the engram.
The use of promoter sequences that drive the
expression of target genes in a cell-type-specific
manner, as well as connectivity-specific labelling
methods⁹, can help to unravel the roles in learn-
ing and memory of synaptic circuits formed by
different types of neuron, revealing, for exam-
ple, the relative contributions of excitatory and
inhibitory neurons, or of neurons in different
layers of the brain’s cortex. When we have a
deeper understanding of the molecular signal-
ling events that occur at synapses during mem-
ory formation¹⁰, tools similar to AS-PaRac1
can be devised to modulate other components
of the molecular machinery. Improved micro-
scopy techniques can already target individual
neurons or synapses¹¹, rather than manipulating
a population of neurons as a whole.


When used together, such technical advances 
will enable us to strengthen existing engrams, 
to facilitate the formation of new ones, and 
to generate synthetic memory traces at the 
synaptic level. We will then be able to study 
the interaction between different memory 
traces, as well as the mechanisms that trans- 
late an engram into behavioural outputs. These 
efforts should allow us to gain an under- 
standing of the intriguing phenomenon 
of memory simply by shining a light on its 
physical basis.


Ju Lu and Yi Zuo are in the Department of 
Molecular, Cell and Developmental Biology, 
University of California, Santa Cruz, 

Santa Cruz, California 95064, USA. 
e-mails: jlu39@ucsc.edu; yizuo@ucsc.edu 


1. Semon, R. Die Mneme als erhaltendes Prinzip im Wechsel des organischen Geschehens (Wilhelm Engelmann, 1904).
2. Hayashi-Takagi, A. et al. Nature 525, 333-338 (2015).
3. Holtmaat, A. & Svoboda, K. Nature Rev. Neurosci. 10, 647-658 (2009).
4. Han, J.-H. et al. Science 323, 1492-1496 (2009).
5. Garner, A. R. et al. Science 335, 1513-1516 (2012).
6. Ramirez, S. et al. Science 341, 387-391 (2013).
7. Jia, H., Rochefort, N. L., Chen, X. & Konnerth, A. Nature 464, 1307-1312 (2010).
8. Chen, C.-C., Lu, J. & Zuo, Y. Front. Neuroanat. 8, 28 (2014).
9. Luo, L., Callaway, E. M. & Svoboda, K. Neuron 57, 634-660 (2008).
10. Mayford, M., Siegelbaum, S. A. & Kandel, E. R. Cold Spring Harb. Perspect. Biol. 4, a005751 (2012).
11. Packer, A. M., Russell, L. E., Dalgleish, H. W. P. & Häusser, M. Nature Methods 12, 140-146 (2015).

This article was published online on 9 September 2015.


Tens of thousands of
atoms replaced by one


Many catalysts comprise metal nanoparticles on solid supports. The discovery
that single atoms of palladium anchored to a solid support also exhibit high
catalytic activity might help to conserve the supply of this and related rare metals.


JOHN MEURIG THOMAS


The platinum-group metals (ruthenium,
rhodium, palladium, osmium, iridium
and platinum) are extensively used
as catalysts in industries that produce com-
pounds such as agrochemicals, dyestuffs and
pharmaceuticals, and several of them are
crucial components of catalytic converters in
cars. But as demand for these relatively scarce
metals increases, their future availability is a
cause for concern. This would be dispelled
if the metals could be used in an atomically
dispersed state, rather than as nanoparticles
containing up to 100,000 atoms, as is con-
ventional. Writing in Angewandte Chemie,
Vilé et al.¹ report that individual atoms of
palladium can be anchored to carbon nitride
(C₃N₄), an easily prepared nanoporous solid².
The resulting materials are excellent, thermally
stable catalysts for selective hydrogenation
reactions, which facilitate the production of
many organic substances, including polymers
and biologically important compounds³.
There are many examples of catalysts in
which the active components are supported
nanoparticles of platinum-group metals
(PGMs) or gold (see refs 4-6, for example). But
in several cases, it has long been suspected’?
that the nanoparticles are unimportant, and
that catalysis occurs at single-atom sites.
Indeed, isolated metal atoms have previously
been manipulated as a strategy for enabling
selective catalytic hydrogenations”’: atomically
dispersed palladium atoms on the surfaces of
a copper crystal stimulate local breaking of
the bonds in hydrogen molecules, and the
resulting hydrogen atoms become mobile
on the copper surface, readily reacting with
unsaturated molecules such as acetylene and
styrene. However, in that system, the single
atoms are laid down on the copper by heat-
ing a palladium source in a high-vacuum
chamber using an electron beam. This method
is suitable for preparing single-atom catalysts
of other PGMs, but does not readily translate
to the production of industrial-scale quantities
of catalysts.

[Figure 1 label: palladium atom.]

Figure 1 | A single-atom palladium catalyst.
Vilé et al.¹ report that isolated palladium atoms
on a solid support of carbon nitride (C₃N₄;
carbon atoms, grey; nitrogen atoms, purple) act
as catalysts for hydrogenation reactions. Strong
bonds to the nitrogen atoms firmly anchor the
palladium atoms in roughly triangular pores in the
stacked, two-dimensional layers of the support.
Only one layer is depicted, for simplicity. (Adapted
from ref. 1.)

In their study, Vilé and colleagues propose 
that the catalytically active individual 
palladium atoms are tenaciously attached 
to the nitrogen atoms of the C₃N₄ support
(Fig. 1), owing to the lone pair of electrons 
that each nitrogen atom has"’. The authors’ 
X-ray-absorption studies found no evidence of 
palladium–palladium bonds, indicating that
the atoms are indeed separate from each 
other. The researchers also studied their 
samples using a technique called annular 
dark-field electron microscopy”, which 
takes advantage of the Rutherford scatter- 
ing of electrons’ (scattering at large angles) 
to detect heavy atoms of PGMs on the light 
elements of C₃N₄. These experiments identi-
fied only single palladium atoms in the
active catalyst. 

Vilé and co-workers’ catalysts are particu- 
larly notable because reproducible, thermally 
stable single-atom preparations can be readily 
made, provided that care is taken to incorpo- 
rate only small amounts of the palladium on 
the nanoporous support. Moreover, C₃N₄ is
inexpensive and may be routinely prepared 
in a graphite-like form”" that has relatively 
widely separated layers, thereby increasing 
the accessibility of the anchored palladium 
atoms to reactants. The authors report that 
it also has the merit of a high surface area
(about 150 square metres per gram), which
maximizes catalytic performance. 

The main hydrogenation reaction studied 
by the authors was the conversion of 1-hexyne 
to 1-hexene, in which carbon-carbon triple 
bonds are selectively converted to double 
bonds, but not further to single ones. Such 
selective hydrogenations have conventionally
used a Lindlar catalyst¹⁵, which consists
of nanoparticles of a palladium–lead
compound¹⁶ on a calcium carbonate support.
The authors’ single-atom palladium catalyst
enables much higher yields and faster reactions
than either a Lindlar catalyst or hydrogenation
catalysts based on nanoparticles of platinum
or gold¹⁷. It also yields 1-hexene with greater
than 99% selectivity. Moreover, after repeated 
use (five successive tests), the catalyst displays 
no decrease in selectivity or in the fraction
of 1-hexyne converted to 1-hexene. Finally, 
Vilé and colleagues report that their single- 
atom catalyst enables the hydrogenation of 
nitrobenzene to form aniline exclusively. This 
kind of reaction is used to make compounds 
for the dyestuffs industry and key intermedi- 
ates in the manufacture of agrochemicals and 
pharmaceuticals. 

Single-atom solid catalysts are of consider- 
able interest, from both a practical”’*”” and 
a theoretical” perspective, not only because 
selective hydrogenations are among the most 
valuable conversions in industrial chemistry, 
but also because these reactions are atom effi- 
cient’: the minimum number of atoms is used 
in each reaction, reducing waste. Such catalysts 
can also be used for other reactions. For example,
recent work¹⁸ shows that single atoms of
platinum function as atom-efficient catalysts 
for the water-gas shift reaction, which is used 
to generate pure hydrogen for the synthesis of 
ammonia. Single-atom platinum catalysts have 
also been used for the selective hydrogenation 
of nitroaromatic compounds⁹.

Readily prepared single-atom platinum
catalysts⁹ supported on iron oxide (FeOₓ) have
been reported to be much more active, selective
and durable than analogous nanoparticle
platinum catalysts. Remarkably, single atoms
of platinum supported on FeOₓ are chemoselective
in the hydrogenation reactions that
they catalyse; that is, they can discriminate
between two or more regions of a molecule
that could potentially react with hydrogen.
For example, the catalysts convert nitro groups
(NO₂) to amino groups (NH₂), but leave
carbonyl groups (C=O) and benzene rings
untouched. In one such reaction, a single- 
atom platinum catalyst displayed a turnover 
frequency (the number of reactant molecules 
converted to product per unit of time) of 
1,500 per hour, which is 20 times as high as the 
previous best result reported in the literature. 
The selectivity for the substrate of that reaction 
was about 99%, the highest reported for any 
PGM catalyst. 
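The figures quoted above can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function names are hypothetical, and normalizing the turnover frequency per active site is an assumption (the text defines it simply as molecules converted per unit of time).

```python
def turnover_frequency(molecules_converted, active_sites, hours):
    """Reactant molecules converted per active site per hour.

    Per-site normalization is an assumption made for this sketch.
    """
    return molecules_converted / (active_sites * hours)


def selectivity(desired_product, total_products):
    """Fraction of converted reactant that ends up as the desired product."""
    return desired_product / total_products


# Values taken from the text: 1,500 per hour, described as 20 times
# the previous best result.
reported_tof = 1500.0
improvement_factor = 20.0
implied_previous_best = reported_tof / improvement_factor  # per hour

# Hypothetical counts chosen only to exercise the helper functions.
example_tof = turnover_frequency(molecules_converted=3000, active_sites=1, hours=2)
example_selectivity = selectivity(desired_product=99, total_products=100)
```

On these numbers, the previous best turnover frequency implied by the text would be 75 per hour.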

The future looks bright for the use of PGMs
as catalysts, both on laboratory and industrial
scales, because the preparation of most kinds 
of single-atom metal catalyst is likely to be 
straightforward, and because characteriza- 
tion of such catalysts has become easier with 
the advent of techniques that readily discrimi- 
nate single atoms from small clusters and 
nanoparticles. A prerequisite is to find ways 
of securely anchoring single atoms of these 
expensive metals to high-surface-area, cheap 
and plentiful solids composed of elements 
that are abundantly available, as Vilé et al. and
others¹⁸ have done. If this can be achieved generally,
then the future deployment of PGMs in
solid catalysts will be transformed.


John Meurig Thomas is in the Department
of Materials Science and Metallurgy, and
at Peterhouse, University of Cambridge,
Cambridge CB3 0FS, UK.

e-mail: jmt2@cam.ac.uk 

1. Vilé, G. et al. Angew. Chem. Int. Edn 54, 11265-11269 (2015).
2. Goettmann, F., Fischer, A., Antonietti, M. & Thomas, A. Angew. Chem. Int. Edn 45, 4467-4471 (2006).
3. Thomas, J. M., Johnson, B. F. G., Raja, R., Sankar, G. & Midgley, P. A. Acc. Chem. Res. 36, 20-30 (2003).
4. Valden, M., Lai, X. & Goodman, D. W. Science 281, 1647-1650 (1998).
5. Hughes, M. D. et al. Nature 437, 1132-1135 (2005).
6. Haruta, M. Faraday Disc. 152, 11-32 (2011).
7. Thomas, J. M. Design and Applications of Single Site Heterogeneous Catalysts (Imperial Coll. Press, 2012).
8. Fu, Q., Saltsburg, H. & Flytzani-Stephanopoulos, M. Science 301, 935-938 (2003).
9. Wei, H. et al. Nature Commun. 5, 5634 (2014).
10. Kyriakou, G. et al. Science 335, 1209-1212 (2012).
11. Arrigo, R. et al. ACS Catal. 5, 2740-2753 (2015).
12. Krivanek, O. L. et al. Nature 464, 571-574 (2010).
13. Midgley, P. A., Weyland, M., Thomas, J. M. & Johnson, B. F. G. Chem. Commun. 907-908 (2001).
14. Groenewolt, M. & Antonietti, M. Adv. Mater. 17, 1789-1792 (2005).
15. Lindlar, H. Helv. Chim. Acta 35, 446-450 (1952).
16. Palczewska, W., Jabłoński, A., Kaszkur, Z., Zuba, G. & Wernisch, J. J. Mol. Catal. 25, 307-316 (1984).
17. Serna, P., Boronat, M. & Corma, A. Top. Catal. 54, 439-446 (2011).
18. Yang, M. et al. J. Am. Chem. Soc. 137, 3470-3473 (2015).
19. Thomas, J. M. Phil. Trans. R. Soc. A (in the press).
20. Peters, B. & Scott, S. L. J. Chem. Phys. 142, 104708 (2015).


EVOLUTIONARY BIOLOGY


Perplexing effects of 
phenotypic plasticity 


Research on guppies provides evidence that phenotypic plasticity — an organism’s 
ability to alter its characteristics in response to changes in the environment — can 
both constrain and facilitate adaptive evolution. SEE LETTER P.372 


JUHA MERILÄ


Altered or new environmental
conditions, such as those brought
about by climate change, are important
sources of selection pressures that drive
organismal adaptation and evolution. But 
alongside genetic adaptation, organisms 
can respond to environmental challenges 
through adaptive phenotypic plasticity, which 
refers to a non-genetic shift in the average 
characteristics (phenotype) of a population 
towards an evolutionary optimum. Whether 
phenotypic plasticity generally facilitates or 
constrains adaptive (genetic) evolution remains 
a contentious issue¹⁻⁴. On page 372 of this
issue, Ghalambor et al.⁵ provide experimental
evidence from guppies suggesting that adap- 
tive phenotypic plasticity in gene-expression 
patterns constrains evolution. But they also 
find that non-adaptive plasticity — pheno- 
typic changes that do not directly contribute 
to increased fitness under the changed con- 
ditions — may facilitate adaptive genetic 
change by increasing the strength of natural 
selection. 




The authors’ experiments involved trans- 
planting wild Trinidadian guppies (Poecilia 
reticulata; Fig. 1) from a stream that also 
hosted predatory cichlid fish into two replicate 
streams without cichlids. They then compared 
patterns of brain gene expression between the 
introduced and original (ancestral) popula- 
tions after three or four generations. Parallel 
changes in gene expression had occurred for 
135 genes in the two introduced populations, 
and these new levels of gene expression were 
similar to those exhibited by a native cichlid- 
free population. This suggested rapid adaptive 
evolution in the introduced populations. 

However, the evolved differences were 
mostly (89% of the genes) in the opposite 
direction to that of phenotypic plasticity in 
expression patterns in the ancestral popula- 
tion. This was inferred by comparing the gene 
expression in ancestral fish reared in either 
the presence or absence of chemical cues 
from predatory cichlids. Thus, the pheno- 
typic plasticity in these genes can be consi- 
dered non-adaptive. The remaining 11% of 
genes exhibited adaptive plasticity: the
evolved differences in gene expression in the
experimentally introduced populations were
concordant with the direction of change of
expression levels in ancestral fish raised in
the absence of predatory-fish cues. The
authors also observed that there was little or
no population divergence in the expression
of these genes in either of the introduced
populations.

Figure 1 | Trinidadian guppies. Ghalambor et al.⁵ show that high levels of non-adaptive phenotypic
plasticity in a source population seem to facilitate rapid adaptive evolution of gene-expression patterns
in guppies transplanted into a different environment. (GERARD LACZ/REX/SHUTTERSTOCK)

The latter findings support evolutionary 
models predicting that adaptive phenotypic 
plasticity should weaken the strength of direc- 
tional selection and thereby slow the rate of evo- 
lution (see refs 6 and 7 for examples). However, 
the real stunner of the study was the discovery 
that most of the evolved (genetic) differences 
in gene-expression patterns in the introduced 
guppy populations had taken place in the oppo- 
site direction to the direction of plasticity in the 
ancestral population. This inverse relationship 
between the direction of plasticity and the direc- 
tion of adaptive evolution suggests that non- 
adaptive plasticity may facilitate (in the authors’ 
words, potentiate) evolution by increasing the 
strength of directional selection required to 
create the observed divergence in gene- 
expression patterns. 
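The comparison of directions described above can be made concrete with a toy sketch. It assumes, purely for illustration, that each gene is summarized by two signed numbers: the plastic change in expression in ancestral fish and the evolved change in the introduced populations. The function names and the example values are hypothetical and are not the authors' data or pipeline.

```python
def classify_plasticity(plastic_change, evolved_change):
    """Label a gene by comparing the sign of plasticity with the sign of evolution.

    Same sign      -> 'adaptive' (plasticity points the way evolution went)
    Opposite sign  -> 'non-adaptive'
    Either is zero -> 'undetermined'
    """
    product = plastic_change * evolved_change
    if product > 0:
        return "adaptive"
    if product < 0:
        return "non-adaptive"
    return "undetermined"


def fraction_non_adaptive(gene_changes):
    """Share of classifiable genes whose plasticity opposes the evolved change."""
    labels = [classify_plasticity(p, e) for p, e in gene_changes]
    scored = [label for label in labels if label != "undetermined"]
    return sum(label == "non-adaptive" for label in scored) / len(scored)


# Made-up (plastic_change, evolved_change) pairs for four genes.
toy_genes = [(0.5, -0.3), (0.2, 0.4), (-1.0, 0.7), (-0.1, 0.2)]
toy_fraction = fraction_non_adaptive(toy_genes)
```

In the study itself, the corresponding share across the 135 genes was reported as 89%.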

The authors obtained support for the 
hypothesized increase in directional selection 
against non-adaptive plasticity by examin- 
ing evolutionary changes in the magnitude of 
plasticity (quantified as the mean difference 
in expression levels of gene transcripts in the 
predator-cue-treated groups) between ances- 
tral and introduced populations. Although 
phenotypic plasticity is, by definition, a
non-genetic response to environmental cues,
the capacity to express it, and its magnitude, 
can be genetically variable’*. Consequently, if 
directional selection had acted most strongly 
on gene transcripts exhibiting non-adaptive 
plasticity, then the magnitude of plasticity in 
introduced populations in response to this 
selection should be reduced. This was just 
what Ghalambor et al. observed. Moreover, 
the decline in the magnitude of plasticity in 
the introduced populations was inversely pro- 
portional to plasticity in the ancestral popula- 
tion. This also aligns with the expectation that 
transcripts exhibiting the greatest non-adap- 
tive plasticity should be the ones that are most 
strongly selected against. 

Although the findings that phenotypic 
plasticity can both constrain and facilitate 
evolutionary (genetic) adaptation are not 
unprecedented, several features of Ghalambor 
and colleagues’ study set it apart from earlier 
work on this topic. For instance, instead of 
focusing on a limited number of traits, the 
authors assessed the plasticity of a large num- 
ber of traits (expressed genes), which allowed 
them to draw robust quantitative conclusions. 
Nevertheless, a question to be addressed is 
whether results from gene-expression analy- 
ses can be extended and generalized to macro- 
scopic traits that have more-direct ecological 
relevance. Similarly, most previous empirical 
studies that focused on the direction of plas- 
tic responses and the direction of subsequent 
evolutionary divergence in wild populations 
have been limited to comparisons between
ancestral and derived populations long after
they diverged. The new study’s focus on initial 
patterns of plasticity and subsequent rapid 
adaptive divergence in the wild provides a 
thought-provoking complement to laboratory 
experiments that have provided evidence supporting
both positive (adaptive)⁸,⁹ and negative
(non-adaptive)¹⁰ relationships between the
directions of plastic responses and evolution.

Ghalambor and colleagues’ results are also 
intriguing because most (but not all) attempts 
to model the effects of plasticity on subse- 
quent evolution have assumed it to be adap- 
tive. Thus, the observed negative relationship 
between the direction of plasticity and the 
direction of evolution in guppies may guide 
future theoretical work in the field. Further- 
more, although increased strength of selec- 
tion caused by non-adaptive plasticity may 
contribute to rapid adaptation and increase 
the likelihood of population persistence, it 
may also lead to reduced population size and 
an increased risk of demographic collapse’. 
By reducing population size, selection stem- 
ming from non-adaptive plasticity may expose 
a population to an increased rate of random 
genetic changes owing to a process known 
as genetic drift. This would in turn propagate 
loss of genetic variation and reduced efficiency 
of selection, counteracting the proposed 
benefit from non-adaptive plasticity. 

As fascinating as it is to suggest that 
maladaptive plasticity may be a strong driver 
of evolution, sceptics may require further 
experimental studies from the wild with more 
population replicates and with a focus on traits 
with established ecological relevance (such as 
behaviours and morphology) to be convinced. 
Such studies would also be helpful, if not essen- 
tial, in developing parameters for models that 
aim to understand how the interplay between 
phenotypic plasticity, natural selection and 
random genetic drift influences evolutionary 
changes.


Juha Merilä is in the Department of
Biosciences, University of Helsinki, 
00014 Helsinki, Finland. 

e-mail: juha.merila@helsinki.fi 


1. West-Eberhard, M. J. Developmental Plasticity and Evolution (Oxford Univ. Press, 2003).
2. Price, T. D., Qvarnström, A. & Irwin, D. E. Proc. R. Soc. B 270, 1433-1440 (2003).
3. Chevin, L.-M., Lande, R. & Mace, G. M. PLoS Biol. 8, e1000357 (2010).
4. Pfennig, D. W. et al. Trends Ecol. Evol. 25, 459-467 (2010).
5. Ghalambor, C. K. et al. Nature 525, 372-375 (2015).
6. Ancel, L. W. Theor. Popul. Biol. 58, 307-319 (2000).
7. Paenke, I., Sendhoff, B. & Kawecki, T. J. Am. Nat. 170, E47-E58 (2007).
8. Waddington, C. H. Adv. Genet. 10, 257-293 (1961).
9. Suzuki, Y. & Nijhout, H. F. Science 311, 650-652 (2006).
10. Schaum, C. E. & Collins, S. Proc. R. Soc. B 281, 20141486 (2014).


This article was published online on 2 September 2015. 




Repositioned to kill
stem cells


Chemotherapy-resistant cancer stem cells make it hard to cure many forms
of the disease. Repositioning an existing drug to tackle this problem could 
significantly improve treatment for one form of leukaemia. SEE LETTER P.380 


TESSA HOLYOAKE & DAVID VETRIE 


In most cases of chronic myeloid leukaemia
(CML), a daily oral medication can rapidly
transform a progressive and ultimately fatal
cancer into a chronic but manageable condition.
But this is not a cure. The persistence of
quiescent (dormant, non-cycling) and thus
drug-resistant leukaemic stem cells (LSCs)
poses an unmet clinical challenge, and any
attempt to cure CML must target the eradication
of these cells. In this issue, Prost et al.¹
(page 380) present provocative preclinical
and early clinical findings demonstrating that
a drug currently used for diabetes therapy can
be repositioned to target a pathway that controls
quiescence in LSCs, causing the gradual
erosion of this cellular pool.

The cause of CML is a mutation in a
normal blood stem cell involving an exchange
of genetic material between chromosomes 9
and 22. This translocation creates a cancer- 
driving gene known as BCR-ABL1, which 
produces a protein with enhanced activ- 
ity as a tyrosine kinase enzyme, leading to 
uncontrolled cell proliferation. BCR-ABL1 has 
been shown to be sufficient to drive the development
of leukaemia in mouse models², and
the discovery of this protein led to the develop- 
ment of tyrosine kinase inhibitors (TKIs) for 
CML treatment. 

In the past two decades, TKIs have dramati- 
cally improved the outcome for people with 
this cancer. Most of those who present with 
early disease respond rapidly to TKI therapy 
and go into long-lasting remission. How- 
ever, TKIs fail to eradicate LSCs, the cells that 
initiate and maintain CML, and these drug- 
resistant cells can drive relapse, or evolve to 
cause further forms of TKI resistance and
more-aggressive disease. As a result, people on
life-long TKI therapy are exposed to associated,
often serious, side effects and may
cease to respond to the treatment at any time.
Furthermore, the significantly improved
survival for those taking TKIs means that
the prevalence of CML is increasing each
year, with inherent social and economic
implications.

Figure 1 | Targeting leukaemic stem cells in chronic myeloid leukaemia. Prost et al.¹ describe a
molecular pathway, involving the receptor PPARγ, the transcription factors STAT5 and HIF2α, and the
regulatory protein CITED2, that induces leukaemic stem cells (LSCs) to enter a dormant (quiescent)
state. They also show that the drug pioglitazone, approved for diabetes treatment, activates PPARγ to
block this pathway, and can kill these cells when used in conjunction with tyrosine kinase inhibitors
(TKIs), which inhibit the protein BCR-ABL1 and thus STAT5, and which are the standard therapy
against active (cycling) leukaemic cells. Several drugs used to treat other diseases, such as axitinib, arsenic
trioxide and hydroxychloroquine, have also been repositioned to treat chronic myeloid leukaemia, but
these have different mechanisms of action.
[Figure labels: pioglitazone (diabetes); axitinib (renal cancer); arsenic trioxide (acute promyelocytic
leukaemia); hydroxychloroquine (malaria); cycling leukaemic stem cell; dormant leukaemic stem cell; HIF2α.]

Several potential mechanisms to explain 
the insensitivity of LSCs to TKIs have been 
proposed, including cellular quiescence. 
Prost et al. report that quiescence in LSCs is
regulated by a pathway involving the receptor
PPARγ, the transcription factors STAT5
and HIF2α, and the protein CITED2, which
is known³ to regulate blood stem-cell quiescence
(Fig. 1). A particular strength of the
study was the use of primary blood stem cells 
(expressing the marker CD34) from people 
with CML to dissect the pathway and con- 
firm the role of each component in regulating 
LSC quiescence. 

The authors go on to show that combining 
imatinib, the standard TKI used to manage 
CML, with the antidiabetic agent pioglitazone,
which activates PPARγ, blocks this pathway in
CML cells. The synergistic effects of the drugs
reduce STAT5 expression and activity, downregulate
HIF2α and CITED2 expression, and
trigger the death of quiescent LSCs. Although
the mechanism by which LSCs are killed
in response to this drug combination is not
clear, they are probably either killed directly
or driven to exit quiescence, which may lead
to their eradication by the TKI. The authors
also demonstrate that the compound JQ1, a
bromodomain inhibitor with broad activity
that includes the suppression of STAT5 activity,
is as effective as pioglitazone (in combination
with imatinib). Although this finding
supports a role for the STAT5 pathway in LSC 
quiescence, the door is still open for studies of 
other agents that may target LSCs through this 
or alternative pathways. 

Collectively, these results strengthen 
the concept that cancer stem cells exhibit 
vulnerabilities in otherwise normal molecular 
pathways that may be targeted in a selective 
manner to obtain a cure. Earlier work dem- 
onstrated that CML stem-cell quiescence is 
in part maintained by the promyelocytic 
leukaemia tumour-suppressor protein, which 
can be targeted by arsenic trioxide⁴, and that
the cellular process of autophagy functions 
as a survival pathway for CML stem cells that 
can be targeted by repositioning the antimalarial
agent hydroxychloroquine⁵ (Fig. 1).
Both of these approaches are currently 
under investigation in the clinic. 

Prost et al. also tested the addition of 
pioglitazone to imatinib therapy in three peo- 
ple with CML, and found that they converted 
from having demonstrable residual leukae- 
mia to being disease-free. The effect lasted for 
months to years after pioglitazone treatment
ceased. These data provided a strong rationale
for a phase II clinical trial, which started in July 
2009 (ACTIM EudraCT 2009-011675-79). 
Although the interim results from this trial are 
encouraging, the study is non-randomized, 
so it will be difficult to ascertain definitively 
that improved response rates are driven by 
pioglitazone. 

Despite the need for further clinical testing 
of this combination therapy, Prost et al. have 
demonstrated the substantial potential for 
drug repositioning in CML research. Their 
results follow a recent report⁶ in which axitinib,
a TKI approved for the treatment of drug- 
resistant renal-cell cancer, was repositioned to 
tackle TKI resistance in CML. Using drugs that 
have already been approved for other purposes 
can shorten the drug-development pathway by 
5-10 years and reduce risks and costs. 


Although drug repositioning can be rather
serendipitous, Prost and colleagues had a
tangible rationale that PPARγ activators such
as pioglitazone warranted investigation in
CML on the basis of their observation⁷ of the
drugs’ activity against a cell-line model of the
disease. Already around 30% of drugs newly
approved for a particular treatment have been
repositioned from another therapy, and such
hypothesis-driven repositioning strategies are
likely to become more common in cancer drug
discovery. This figure is set to rise further as
our understanding of cellular pathways and
processes increases and we include innovative
computational approaches to facilitate
disease-, drug- and treatment-oriented drug
repositioning. It is clear that repositioning
will increasingly help the fast-tracking
of drugs into the clinic. As demonstrated by
Prost and colleagues, this could soon signal the
beginning of the end for stem-cell quiescence
in CML and other cancers.


Tessa Holyoake and David Vetrie are at
the Institute of Cancer Sciences, University
of Glasgow, Glasgow G12 0ZD (T.H.) and
Glasgow G61 1QH (D.V.), UK.
e-mails: tessa.holyoake@glasgow.ac.uk;
david.vetrie@glasgow.ac.uk


1. Prost, S. et al. Nature 525, 380-383 (2015).
2. Daley, G. Q., Van Etten, R. A. & Baltimore, D. Science 247, 824-830 (1990).
3. Kranc, K. R. et al. Cell Stem Cell 5, 659-665 (2009).
4. Ito, K. et al. Nature 453, 1072-1078 (2008).
5. Bellodi, C. et al. J. Clin. Invest. 119, 1109-1123 (2009).
6. Pemovska, T. et al. Nature 519, 102-105 (2015).
7. Prost, S. et al. J. Clin. Invest. 118, 1765-1775 (2008).


This article was published online on 2 September 2015.


CONDENSED-MATTER PHYSICS


Charge topology in
superconductors


X-ray images of cuprate superconductors reveal the fractured, defect-riddled
backbone on which superconductivity develops. The results take us a step closer
to understanding how supercurrent flows on small spatial scales. SEE LETTER P.359


ERICA W. CARLSON


The quantum motion of electrons
enforces a high degree of homogeneity
in conventional materials such as metals
and semiconductors. As a result, the electrons
spread out evenly in these materials, like liquid
filling a container. By contrast, nanoscale
images of copper oxide (cuprate) superconductors
have revealed that the materials’ electrons
form clumps at the surface¹. On page 359 of
this issue, Campi et al.² report X-ray images of
a cuprate superconductor, revealing complex
patterns of electrons³ that are also scaffolded
throughout the interior of the material, and
on a much larger scale than has been observed
before. Just like the skeleton of a coral reef,
where, the greater the scale on which the reef is
observed, the more complexity meets the eye,
the electrons in these materials form structures
full of gnarled hollows of varying size.

Campi and colleagues find that the size of
the patterns formed by electrons is strongly
tied to the degree to which the cuprate superconductor
is doped, meaning that a small
amount of one type of atom is substituted
for another, to change the charge available
for conducting current through the material.
Materials that have a uniform distribution
of electrons, such as semiconductors, are
typically robust against spatial variations in
doping level. Nanostructures are a notable
exception: variations in doping level affect the
performance of the smallest semiconductor
devices. The electrons inside cuprate compounds
that superconduct at high temperatures
(up to 160 kelvin) spontaneously form
nanostructures, and so these materials are
sensitive to local doping variations.

The authors made their discovery using
a technique called scanning micro X-ray
diffraction. In this approach, a high-[…]
moves […] angle that depends
on the periodicity of local charge
variations in the material. It was
already known that the charge in cuprate
superconductors is often locally ordered
into a unidirectional pattern, with a periodicity
of about four crystalline unit cells (the
smallest periodically repeating structures in
a crystal), so that the electron-density distribution
resembles striped wallpaper⁴,⁵.

Campi et al. scan a micrometre-sized X-ray
beam across a superconducting sample to
probe how the character of this stripy charge-
density wave varies from spot to spot in the
material. But instead of a single sheet of ‘striped
wallpaper’, they find rips, tears and patches 
in the electronic texture, as though someone 
had papered a wall with sheets of many dif- 
ferent sizes and shapes, and with complete 
disregard for whether the borders matched up. 

As in nanostructure semiconductors, these 
electronic textures in cuprate superconduc- 
tors are sensitive to local variations in the 
doping level on nanometre and micrometre 
scales. For example, the lower the level of local 
oxygen doping, the higher is the contrast of 
the charge-density waves (bright wallpaper 
stripes), introducing greater disorder into the 
overall charge pattern. 

These effects are important for super- 
conductivity because of two factors: dimen- 
sionality and connectivity. The behaviour of 
electrons ultimately depends on the shape of 
their quantum-mechanical waves, and waves 
show vastly different behaviour in different 
dimensions. In three dimensions, the energy 
carried by a wave over a distance r spreads
out as r⁻², as in sound waves coming from a
speaker. In two dimensions, such as in ripples
emanating from a pebble thrown into a pond,
the distance dependence changes to r⁻¹. In one
dimension, waves cannot dissipate by spread- 
ing out. Like the bow waves of canal tugboats, 
there is only one way for a wave to go in one 
dimension: forward. The charge-density 
structures that Campi et al. find are enticing, 
because they effectively reduce the dimension- 
ality that electrons can explore, which can lead 
to mechanisms of superconductivity that are 
fundamentally different from those of conventional
superconductors⁶,⁷.
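The three distance laws quoted above follow a single pattern: in d dimensions the energy of a spreading wave is shared over a boundary that grows as r^(d-1), so the energy density falls as r^-(d-1). A minimal sketch of that rule, with a hypothetical function name:

```python
def energy_falloff(r, dimension):
    """Relative energy density at distance r for a wave spreading in 1, 2 or 3 dimensions.

    3D: r**-2 (sound from a speaker), 2D: r**-1 (pond ripples),
    1D: constant (no spreading at all).
    """
    if dimension not in (1, 2, 3):
        raise ValueError("dimension must be 1, 2 or 3")
    return r ** -(dimension - 1)


# Doubling the distance from the source in each dimensionality.
falloff_3d = energy_falloff(2.0, 3)
falloff_2d = energy_falloff(2.0, 2)
falloff_1d = energy_falloff(2.0, 1)
```

Doubling the distance therefore cuts the energy density to a quarter in three dimensions and to a half in two, and leaves it unchanged in one.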

However, the dimensionality that Campi 
et al. infer is not integer. They find that the 
sizes of the patches formed by the charge- 
density waves are distributed according to a 
power law, which is typical of fractal dimen- 
sions. Like any good fractal, these patterns 
display similarities whether they are observed 
from close up or far away. Although much
theoretical effort has gone into understanding
electrons in three, two and one dimensions, we
know little about the behaviour of electrons in
fractal dimensions.
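To give a feel for what "distributed according to a power law" means in practice, the sketch below draws synthetic patch sizes from a known power law and recovers the exponent with the standard maximum-likelihood formula for a continuous power law. Everything here is an assumption made for illustration: the data are simulated, the exponent 2.5 is arbitrary, and this is not the analysis used by Campi et al.

```python
import math
import random


def sample_power_law(alpha, x_min, n, rng):
    """Draw n sizes from p(x) ~ x**-alpha for x >= x_min (alpha > 1) by inverse transform."""
    return [x_min * (1.0 - rng.random()) ** (-1.0 / (alpha - 1.0)) for _ in range(n)]


def estimate_exponent(sizes, x_min):
    """Maximum-likelihood exponent: alpha = 1 + n / sum(ln(x / x_min)) over x >= x_min."""
    tail = [x for x in sizes if x >= x_min]
    return 1.0 + len(tail) / sum(math.log(x / x_min) for x in tail)


# Synthetic 'patch sizes' with an arbitrary true exponent of 2.5.
rng = random.Random(0)
synthetic_sizes = sample_power_law(alpha=2.5, x_min=1.0, n=20000, rng=rng)
alpha_hat = estimate_exponent(synthetic_sizes, x_min=1.0)
```

A power law has no characteristic size, which is why such patterns look statistically similar at every magnification.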

The other key ingredient of the effects that 
the authors observe is connectivity. Macro- 
scopic superconductivity is ultimately a 
charge-transport phenomenon (for electric- 
ity to flow, electrons must be transported from 
one side of the sample to the other), and this 
transport is dominated by connectivity. With- 
out connections between different domains, 
supercurrent cannot flow through the sample, 
and the material fails to be a practical, bulk 
superconductor. 
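The role of connectivity can be illustrated with a toy model, assuming (as a simplification that is not in the source) that the sample is a small grid in which each cell either superconducts or does not. A breadth-first search then asks whether superconducting cells link the left edge to the right edge.

```python
from collections import deque


def spans_sample(grid):
    """Return True if superconducting cells (truthy) connect the left edge to the right edge."""
    rows, cols = len(grid), len(grid[0])
    queue = deque((r, 0) for r in range(rows) if grid[r][0])
    seen = set(queue)
    while queue:
        r, c = queue.popleft()
        if c == cols - 1:
            return True
        # Visit the four nearest neighbours that superconduct and are unvisited.
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] and (nr, nc) not in seen:
                seen.add((nr, nc))
                queue.append((nr, nc))
    return False


# A hypothetical 3x3 sample with a single connecting path.
connected = [[1, 1, 0],
             [0, 1, 1],
             [0, 0, 0]]

# The same sample with one crucial cell removed.
broken = [[1, 1, 0],
          [0, 0, 1],
          [0, 0, 0]]

has_path = spans_sample(connected)
has_path_after_defect = spans_sample(broken)
```

In this toy grid, removing a single cell is enough to stop current crossing the sample.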

A disordered charge distribution can there- 
fore be devastating to connectivity in super- 
conducting materials. One crucial connection 
erased by disorder can unlink an entire sys- 
tem. Disorder can also affect the nature of the 
changes in physical properties that accom- 
pany the onset of phase transitions (including 
superconductivity), by smearing out an abrupt 


ATMOSPHERIC SCIENCE 


transition, lowering the temperature at which 
it happens or changing the geometry of the 
fractal charge distribution associated with a 
smooth phase transition. Such disorder can 
make it much harder for a system to equili- 
brate, causing the changes in its properties to 
lag in response to external inputs (hysteresis) 
and to be dependent on past inputs (memory 
effects). However, these effects can also be 
turned into an opportunity to control domain 
morphology through system-training proto- 
cols®, much like the way in which commercial 
permanent magnets are prepared in a mag- 
netized state. This means that the disordered 
states that the authors have discovered might 
be exploited to manipulate superconductivity 
along similar lines. 

One limitation of Campi and colleagues’ 
study is that they did not directly observe the 
morphology of the path that the supercon- 
ducting electrons take. Rather, they inferred it 
from the morphology of the observed variable 
charge distribution. More data are needed to 


The death toll from 
air-pollution sources 


Estimates of worldwide deaths associated with exposure to fine particles in 
atmospheric pollution provide some surprising results. The findings will guide 
future research and act as a wake-up call for policymakers. SEE LETTER P.367 


MICHAEL JERRETT 


estimate the number of worldwide deaths 

each year caused by seven sources of air 
pollution. To do this, they used advanced 
global atmospheric-chemistry models, 
detailed country-level population and health 
data, and integrated exposure-response (IER) 
functions — statistical models that describe 
how mortality varies with exposure to fine 
particulate air pollution. The atmospheric- 
chemistry model allowed the researchers to 
attribute air pollution and premature deaths in 
different regions to emissions associated with 
various sectors of the economy. 

More than 3.2 million deaths per year 
have been attributed’ to exposure to outdoor 
particulate matter known as PM, ; — particles 
less than 2.5 micrometres in diameter, which 
can penetrate deep into the lungs and cause 
a wide range of health problems. Many parts 
of the United States and Europe have seen 
substantial improvements in air quality over 
recent decades as a result of regulatory inter- 
ventions, and growing evidence™ suggests that 
these improvements benefit public health. But 


IE this issue, Lelieveld et al. (page 367) 


other regions, particularly countries in Asia 
with vast populations, continue to have poor 
air quality” (Fig. 1), with the emissions of sev- 
eral key pollutants expected to increase in the 
future®. The overlap of high pollution and large 
populations takes a huge toll on public health, 
but little is known about the pollution sources 
that are responsible for premature deaths. 
Enter Lelieveld and colleagues. The authors’ 
results are surprising and potentially impor- 
tant for protecting public health globally. 
First, they estimate that ambient PM, , from 
commercial and residential energy sources 
contributes the most to premature deaths 
worldwide. These sources include solid fuel 
such as coal and biomass used for heating 
and cooking, local waste disposal and diesel 
generators. Such sources account for 32% of 
the premature deaths in China and 50-70% 
of those in India and other Asian nations. 
The IER functions’ that the authors used 
pool epidemiological exposure-response 
information for mortality associated with 
exposure to outdoor particles, emissions from 
biomass burning, and tobacco smoke (both 
from active smoking and second-hand expo- 
sure). For deaths attributable to stroke and 


330 | NATURE | VOL 525 | 17 SEPTEMBER 2015 


© 2015 Macmillan Publishers Limited. All rights reserved 


probe that intermediate length between the 
nanometre and macroscopic length scales, so 
as to chart the true path of the superconducting 
electrons. Future work should also investigate 
how the spatial pathways of superconductivity 
are affected by the complex interplay between 
disordered electron distributions and charge- 
density waves. m 


Erica W. Carlson is in the Department of 

Physics and Astronomy, Purdue University, 
West Lafayette, Indiana 47907-2036, USA. 
e-mail: ewcarlson@purdue.edu 


. Kohsaka, Y. et a/. Science 315, 1380-1385 (2007). 

. Campi, G. et a/. Nature 525, 359-362 (2015). 

. Dagotto, E. Science 309, 257-262 (2005). 

. Tranquada, J. M., Sternlieb, B. J., Axe, J. D., Nakamura, Y. 

& Uchida, S. Nature 375, 561-563 (1995). 

5. Comin, R. et al. Science 343, 390-392 (2014). 

6. Emery, V. J., Kivelson, S. A. & Zachar, O. Phys. Rev. B 
56, 6120 (1997). 

7. Senthil, T. & Fisher, M. P. A. Phys. Rev. Lett. 86, 
292-295 (2001). 

8. Carlson, E. W. & Dahmen, K. A. Nature Commun. 2, 

379 (2011). 


BWHHE 


cardiovascular disease, the IER curve is steeper 
at low exposures (implying that the mortal- 
ity effects of increases in PM, ; are greater at 
lower particulate levels), but generally flattens 
at higher exposures. Large uncertainties in the 
IER for PM, ; occur in the exposure range of 
approximately 30-100 micrograms per cubic 
metre (ref. 7), because no information for 
cardiovascular mortality due to outdoor PM, ; 
is available, and because only a few studies of 
second-hand smoke exposure exist. A caveat to 
Lelieveld and colleagues’ estimates of prema- 
ture deaths from commercial and residential 
energy sources in Asian countries is that they 
fall mostly in these areas of high uncertainty. 

Studies of the effects of biomass burning on cardiovascular disease or stroke at any level of exposure are also lacking⁸. Furthermore, the largest study so far to examine how sources of fine-particle air pollution affect heart-disease mortality⁹ found no effects for ambient PM2.5 from biomass burning in the United States. Nevertheless, as the authors point out, even if it is assumed that biomass burning and commercial and residential energy use do not contribute to mortality associated with heart disease, such energy use remains the largest factor for global mortality associated with air pollution overall, even though the total number of deaths declines.

Lelieveld and colleagues’ next major finding is that agricultural sources are the second-largest contributor to global mortality from PM2.5: releases of ammonia from livestock and fertilizers lead to atmospheric formation of ammonium nitrate and sulfate particles. Agricultural sources are the leading source of mortality in the eastern United States, Russia, Turkey, Korea, Japan and Europe, contributing to more than 40% of the deaths in many European countries.


Figure 1 | Burning waste in India. Lelieveld et al.¹ estimate that fine particles generated from commercial and residential energy use, including waste burning, contribute the most to pollution-associated premature deaths globally, especially in India and other Asian countries. (Photo credit: NOAH SEELAM/AFP/GETTY)


This finding assumes that ammonium 
nitrate and sulfate have the same toxicity as 
other constituents of the atmospheric parti- 
cle mixture. Some epidemiological studies” 
do indeed report adverse effects from these 
particles, but many toxicological data indi- 
cate that they have little biological potency at 
ambient levels’. The contradictory evidence 
for ammonium sulfate probably arises because 
these particles are often mixed with metals and 
other toxic components from coal or industrial 
sources’’. It could therefore be that Lelieveld 
et al. overestimate the effects of particles from 
agricultural sources. The finding is highly 
valuable, however, because agriculture has 
generally not been seen as a major source of 
air pollution or premature death, and because 
it suggests that much more attention needs to 
be paid to agricultural sources, by both scien- 
tists and policymakers. 

Third, the researchers find that traffic-related
pollution accounts for about 20% of the deaths
from PM2.5 in the United States, the United
Kingdom and Germany, but only 5% globally.
The spatial resolution of their global assessment
(which considers sub-areas of approximately
110 × 110 km) cannot capture
the effects of finer-scale variation of traffic 
pollution. Other studies'"” have found that var- 
iation in pollution about 50-500 metres from 
the roadside correlates with mortality. Mount- 
ing evidence’ also points to heightened effects 
on health and mortality from the components 
and reaction products of traffic emissions com- 
pared with other emission sources. Thus, the 
effects from traffic might be underestimated 


by Lelieveld and colleagues. But the findings 
send out two crucial messages: traffic emissions 
remain a major source of premature death in 
Western countries even after extensive regu- 
latory action, and the rapid rate of growth in 
traffic in many regions may well lead to 
increased pollution and more premature deaths 
in the near future. 

Finally, the authors project a doubling of 
mortality from air pollution by 2050 on the 
basis of projected rates of increase in pollution 
and population levels. This projection should 
sound alarm bells for public-health agencies 
around the world. It also raises the question 
of which sources should be reduced in dif- 
ferent regions. The answer depends on how 
much trust we put in the IER curve. Because 
the steep part of the curve is at lower levels of
ambient PM2.5, large benefits can accrue from
relatively small reductions in air pollution in
cleaner regions, whereas the flatness of the
curve at high levels necessitates large reductions
in the polluted areas of Asia to achieve
major health benefits¹³.
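
The shape argument can be made concrete with a toy saturating exposure-response curve. The sketch below is illustrative only: it assumes a curve that is steep at low PM2.5 and flat at high PM2.5, as described in the text, and the parameter values (`ALPHA`, `GAMMA`, `DELTA`, `Z_CF`) and function names are hypothetical rather than the fitted IER values of ref. 7 or of Lelieveld and colleagues.

```python
import math

# Hypothetical parameters, chosen only so that the curve is steep at low
# PM2.5 and flat at high PM2.5. They are NOT fitted values from any study.
ALPHA = 1.0   # maximum excess relative risk
GAMMA = 0.02  # how quickly the curve saturates
DELTA = 1.0   # curvature exponent
Z_CF = 5.0    # assumed level (ug/m3) below which no excess risk is counted


def relative_risk(pm25: float) -> float:
    """Relative risk of death at an annual mean PM2.5 level (ug/m3)."""
    excess = max(pm25 - Z_CF, 0.0)
    return 1.0 + ALPHA * (1.0 - math.exp(-GAMMA * excess ** DELTA))


def attributable_fraction(pm25: float) -> float:
    """Share of deaths attributable to the exposure: (RR - 1) / RR."""
    rr = relative_risk(pm25)
    return (rr - 1.0) / rr


def benefit_of_cut(pm25: float, cut: float) -> float:
    """Drop in attributable fraction when PM2.5 falls by `cut` ug/m3."""
    return attributable_fraction(pm25) - attributable_fraction(pm25 - cut)


if __name__ == "__main__":
    # Same absolute reduction, applied in a clean and in a polluted region.
    for level in (15.0, 100.0):
        gain = benefit_of_cut(level, 10.0)
        print(f"start at {level:5.1f} ug/m3, cut 10 ug/m3: gain {gain:.3f}")
```

With these assumed parameters, the same 10 ug/m3 reduction removes a far larger share of attributable deaths in the cleaner setting than in the polluted one, which is the asymmetry the paragraph describes.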

Lelieveld and colleagues’ findings suggest 
that about 1 million lives could be saved every 
year by reducing ambient exposure to pollu- 
tion. A further 3.54 million lives per year could 
be saved by lowering indoor exposures from 
similar sources’, mainly through changes in 
commercial and residential energy use. Incen- 
tivizing the use of cleaner fuels or of electricity 
for local energy needs would reduce mortality 
from both indoor and ambient PM2.5 exposure
and should be a priority in Asia and other
regions that rely on solid fuels. For many parts 


of the world, more research is needed if we 
are to understand the impacts of agricultural 
practices on air pollution and mortality, and 
especially to determine the toxicity of ammo- 
nium nitrate and sulfate emanating from this 
source. And in countries that already have low 
ambient levels of pollution, sizeable benefits 
can still be achieved by reducing emissions 
from fossil-fuel power plants and traffic.


Michael Jerrett is in the Department of Environmental Health Sciences, and at the Center for Occupational and Environmental Health, Fielding School of Public Health, University of California, Los Angeles, Los Angeles, California 90095.
e-mail: mjerrett@ucla.edu


1. Lelieveld, J., Evans, J. S., Fnais, M., Giannadaki, D. & Pozzer, A. Nature 525, 367-371 (2015).
2. Lim, S. S. et al. Lancet 380, 2224-2260 (2012).
3. Pope, C. A. III, Ezzati, M. & Dockery, D. W. N. Engl. J. Med. 360, 376-386 (2009).
4. Gauderman, W. J. et al. N. Engl. J. Med. 372, 905-913 (2015).
5. Baumgartner, J. et al. Proc. Natl Acad. Sci. USA 111, 13229-1323 (2014).
6. Wang, S. X. et al. Atmos. Chem. Phys. 14, 6571-6603 (2014).
7. Burnett, R. T. et al. Environ. Health Perspect. 122, 397-403 (2014).
8. Smith, K. R. et al. Annu. Rev. Public Health 35, 185-206 (2014).
9. Thurston, G. D. et al. Environ. Health Perspect. (in the press).
10. Kelly, F. J. & Fussell, J. C. Atmos. Environ. 60, 504-526 (2012).
11. Smith, K. R. et al. Lancet 374, 2091-2103 (2009).
12. Hoek, G. et al. Environ. Health 28, 12(1):43 (2013).
13. Apte, J. S., Marshall, J. D., Cohen, A. J. & Brauer, M. Environ. Sci. Technol. 49, 8057-8066 (2015).




ARTICLE 


doi:10.1038/nature15257 


Labelling and optical erasure of synaptic 
memory traces in the motor cortex 


Akiko Hayashi-Takagi, Sho Yagishita, Mayumi Nakamura, Fukutoshi Shirai, Yi I. Wu, Amanda L. Loshbaugh, Brian Kuhlman, Klaus M. Hahn & Haruo Kasai


Dendritic spines are the major loci of synaptic plasticity and are considered as possible structural correlates of memory. 
Nonetheless, systematic manipulation of specific subsets of spines in the cortex has been unattainable, and thus, the link 
between spines and memory has been correlational. We developed a novel synaptic optoprobe, AS-PaRac1 (activated
synapse targeting photoactivatable Rac1), that can label recently potentiated spines specifically, and induce the selective
shrinkage of AS-PaRac1-containing spines. In vivo imaging of AS-PaRac1 revealed that a motor learning task induced
substantial synaptic remodelling in a small subset of neurons. The acquired motor learning was disrupted by the optical 
shrinkage of the potentiated spines, whereas it was not affected by the identical manipulation of spines evoked by a 
distinct motor task in the same cortical region. Taken together, our results demonstrate that a newly acquired motor skill 
depends on the formation of a task-specific dense synaptic ensemble. 


Optogenetics is a powerful tool for controlling neuronal action potentials, and has been used to demonstrate the crucial role of cell assemblies in representing memory traces. However, owing to the limitations of spatial resolution of probes currently available, manipulation of individual dendritic spines, the major sites of excitatory synapses, has been unfeasible, hindering the comprehensive understanding of synaptic reorganization during learning. Thus, for spine-specific light control, we took advantage of the structural properties of spines: the tight correlation between spine volume and function. Because the prolonged activation of the small GTPase Rac1 induces spine shrinkage, we used a photoactivatable form of Rac1 (PaRac1) to induce spine shrinkage, which allowed us to control synaptic transmission with light. Moreover, since it has long been suggested that the memory trace is allocated to specific neurons and spines of neurocircuits, here we targeted PaRac1 to the activated synapses (activated synapse targeting PaRac1, AS-PaRac1) to establish a novel method, termed ‘synaptic optogenetics’, to visualize and manipulate the memory trace.


AS-PaRac1 labels the potentiated spines


We first re-engineered the original PaRac1 construct to optimize its properties for synaptic manipulation. Introduction of L514K and L531E mutations into the original construct markedly reduced the undesirable Rac1 background activity in the dark, as shown by isothermal titration calorimetry (ITC), the neuronal morphology, and co-immunoprecipitation (Extended Data Fig. 1a-c). Next, PaRac1 was fused with a deletion mutant of PSD-95 (PSDΔ1.2), which is known to concentrate at the postsynaptic site, but cannot bind with the major PDZ binding proteins, thus minimizing the undesirable effects of PSD-95 overexpression. An enrichment index, the quantitative ratio of synaptic localization compared to that of the dendritic shaft (see Methods), supported the effective accumulation of PSD-PaRac1 at the synapse, especially at the tip of the spine (Fig. 1a, construct B), where it was highly co-localized with the endogenous PSD-95, but not with an axonal marker (Extended Data Fig.


[Figure 1 image not reproduced. Panel labels recoverable from the extraction: a, construct maps with enrichment and hot spot indices (mRFP, Venus); b, uncaging alone, uncaging/FSK, uncaging/FSK/Ani; c, d, time axes in minutes; e, vehicle versus lactacystin.]

Figure 1 | Potentiation-dependent accumulation of AS-PaRac1 to the dendritic spines in hippocampal slice cultures. a, Mapping for essential domains for the discrete distribution of the probe (arrowheads). Enrichment and hot spot index are plotted as arbitrary units. b, Representative images of single spine potentiations by glutamate uncaging (arrowheads) in the presence or absence of forskolin (FSK) and anisomycin (Ani). 0Mg, no Mg²⁺; 1Mg, 1 mM MgCl₂. c, d, Time courses of spine head volume (V, c) and AS-PaRac1 accumulation (AS, d), both measured by fluorescence intensity. The mean change 60 min after uncaging in the stimulated or neighbouring spines. e, The effect of lactacystin on the discrete accumulation of AS-PaRac1 (arrowheads). DIV, days in vitro. Scale bars, 2 μm. Error bars represent s.e.m. Detailed information on statistical methods/results is described in Extended Data Table 1.


¹Laboratory of Structural Physiology, Center for Disease Biology and Integrative Medicine, Faculty of Medicine, University of Tokyo, Bunkyo-ku, Tokyo 113-0033. ²PRESTO, Japan Science and Technology Agency, 4-1-8 Honcho, Kawaguchi, Saitama 332-0012, Japan. ³CREST, Japan Science and Technology Agency, 4-1-8 Honcho, Kawaguchi, Saitama 332-0012, Japan. ⁴Center for Cell Analysis and Modeling, University of Connecticut Health Center, Farmington, Connecticut 06032, USA. ⁵Lineberger Comprehensive Cancer Center, University of North Carolina, Chapel Hill, North Carolina 27599, USA. ⁶Department of Biochemistry and Biophysics, University of North Carolina, Chapel Hill, North Carolina 27599, USA. ⁷Department of Pharmacology, University of North Carolina, Chapel Hill, North Carolina 27599, USA.




1d). Finally, for neuronal input specificity, we exploited the dendritic targeting element (DTE) of Arc mRNA, which is selectively targeted and translated in activated dendritic segments in response to synaptic activation in an NMDA (N-methyl-D-aspartate) receptor-dependent manner. Interestingly, PSD-PaRac1-DTE sparsely labelled spines (Fig. 1a, construct C, arrowheads). Quantification using a hot spot index (see Methods), which indicates how unevenly PaRac1 variants were distributed, suggested that both PSDΔ1.2 and DTE were necessary for this characteristic distribution (Fig. 1a, constructs C and E). Therefore, the combination of PSDΔ1.2 and DTE was termed the ‘AS (activated synapse targeting) cassette’, and the PaRac1 sequence flanked with the AS cassette was named AS-PaRac1 (Fig. 1a, construct C).
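
The two indices used above can be illustrated with a small sketch. The enrichment index follows the description in the text (synaptic signal relative to the dendritic shaft). The exact definition of the hot spot index is given only in the Methods, so the coefficient of variation below is an assumed stand-in for "how unevenly the probe is distributed", not the authors' formula; all function names and intensity values are hypothetical.

```python
from statistics import mean, pstdev


def enrichment_index(spine_intensities, shaft_intensity):
    """Mean spine fluorescence divided by dendritic-shaft fluorescence."""
    if shaft_intensity <= 0:
        raise ValueError("shaft intensity must be positive")
    return mean(spine_intensities) / shaft_intensity


def unevenness(spine_intensities):
    """Coefficient of variation across spines.

    Assumed stand-in for the hot spot index; the paper's own definition
    is in its Methods and may differ.
    """
    avg = mean(spine_intensities)
    if avg == 0:
        return 0.0
    return pstdev(spine_intensities) / avg


if __name__ == "__main__":
    # Hypothetical fluorescence values (arbitrary units).
    evenly_labelled = [2.0, 2.1, 1.9, 2.0]
    sparsely_labelled = [0.2, 0.1, 7.5, 0.2]
    shaft = 1.0
    for name, spines in (("even", evenly_labelled),
                         ("sparse", sparsely_labelled)):
        print(name,
              round(enrichment_index(spines, shaft), 2),
              round(unevenness(spines), 2))
```

The two example dendrites have the same mean enrichment, but the sparsely labelled one scores much higher on the unevenness measure, which is the kind of contrast a hot spot index is meant to capture.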

Next, we tried to unravel what this new synaptic probe labelled. Bicuculline, which increases neuronal excitation, robustly enhanced the number of AS-PaRac1-containing spines, and reduction of the hot spot index revealed that the distribution of AS-PaRac1 became relatively uniform upon bicuculline treatment. In contrast, the blockage of action potentials by tetrodotoxin (TTX) decreased the accumulation of the probe, resulting in a reduction in the spine enrichment index of the probe (Extended Data Fig. 2a-d). Because these findings suggested that synaptic activation regulates the localization of AS-PaRac1, we hypothesized that AS-PaRac1 accumulates in recently potentiated spines. Indeed, when AS-PaRac1 was co-transfected with SEP-GluA1, the synaptic incorporation marker for AMPA (α-amino-3-hydroxy-5-methyl-4-isoxazolepropionic acid) receptor subunit GluR1 (refs 20, 21), the fluorescence signals of these two probes inside each spine were significantly correlated (Extended Data Fig. 2e, arrowheads). Furthermore, the protein synthesis-dependent potentiation during the single spine LTP protocol, which was elicited by glutamate uncaging and the adenylyl cyclase activator forskolin (FSK), induced the accumulation of AS-PaRac1 in the stimulated spines, while the protein synthesis-independent plasticity (glutamate uncaging alone) did not. Consistently, the protein synthesis inhibitor anisomycin abolished the FSK-induced AS-PaRac1 accumulation (Fig. 1b-d). No increase was observed in AS-PaRac1 fluorescence in the neighbouring spines, indicating that AS-PaRac1 accumulation was restricted to the stimulated spine (Fig. 1d). The DTE sequence was necessary for activity-dependent AS-PaRac1 accumulation (Extended Data Fig. 2f, g), supporting the idea that locally translated AS-PaRac1, unlike somatically translated AS-PaRac1, was preferentially recruited to enlarged spines. PaRac1 did not exhibit uneven distribution unless the construct contained the PSD-95 domain (Fig. 1a, construct D). Because PSD-95 is rapidly degraded by proteasomes, we examined the effect of the proteasome inhibitor lactacystin and found that it completely disrupted the unique distribution of the probe (Fig. 1e). Taken together, we concluded that AS-PaRac1 is a probe that specifically labels the enlarged and newly generated spines (see Extended Data Fig. 3 for detailed cellular mechanisms), which are referred to as the ‘structurally potentiated spine’, and the potentiation labelled by AS-PaRac1 is described as ‘potentiated spine’ hereafter.


Spine labelling by AS-PaRac1 in vivo


To characterize this probe in vivo, we used rotarod training as the model of motor learning. Because motor learning is impaired in Arc knockout mice, we assumed that the induction of AS-PaRac1 by the Arc promoter would enhance specific labelling during learning-induced potentiation. Arc::AS-PaRac1 was delivered to the cortical layer II/III of the primary motor cortex (M1), where a robust reorganization of neuronal circuits is induced upon motor learning. Cranial window surgery for two-photon imaging was performed based on the stereotaxic coordinates of the previous functional mapping for the hind limb area. Spine volume and AS-PaRac1 fluorescence were compared quantitatively before and after training (Fig. 2a-e). Consistent with previous findings, even in the training-free period, a substantial number of spines ‘spontaneously’ underwent structural potentiation (formation or enlargement of spines; see the definition in Extended Data Fig. 4a), but the trained mice exhibited significantly more structural potentiation compared with the non-trained mice (Fig. 2d). Notably, synaptic fluorescence of AS-PaRac1 just after training (0 day) strongly correlated with the change in spine size upon training (Fig. 2f). It is unlikely that the accumulation of AS-PaRac1 caused the potentiation or labelled the spines primed for potentiation such as for the ‘tagged synapse’, because the initial quantity of AS-PaRac1 before learning (−1 day) did not correlate with the change in spine size after learning (Fig. 2g). Analysis of AS-PaRac1 puncta in the dendritic shaft suggested that the majority

[Figure 2 image not reproduced. Recoverable panel text: a, SARE-ArcMin promoter, Venus, PaRac1, DTE (AS cassette); b, cranial window centre coordinate AP −1.1, ML +1.0, with rotarod training and two-photon imaging; f, ΔV = 0.53 × AS + 3.01 (0.61, P < 0.0001); g, ΔV = −0.03 × AS + 10.0 (−0.036, NS); h, proportion of AS (+) spines: enlarged 69%, new 83%, shrunk 2%, eliminated 0%, no change 1%.]

Figure 2 | Spatiotemporal dynamics of AS-PaRac1 labelling in vivo during the rotarod task. a, Schematic of the Arc promoter-driven AS-PaRac1. b, Experimental design. EP, electroporation; DsRed, Discosoma sp. red fluorescent protein. c, Images of spine formation (arrows) and spine enlargement (arrowheads). Green circles, AS-PaRac1; magenta circles, spines that initially acquired AS-PaRac1 but lost it afterward, although the structural change was persistent. d, Fraction of structural change of spines. e-g, Quantification of spine size and AS-PaRac1 (e). Size measured by fluorescence intensity, arbitrary units (a.u.). Relationship between AS-PaRac1 and volume change (ΔV) after (0 day, f) and before (−1 day, g) learning. h, Percentage of AS-PaRac1-containing spines (AS-PaRac1 ≥ 1 a.u., area shaded green in f). i, Mapping of AS-PaRac1. Potentiations before, just after, 1 day after, and 2 days after learning are separately depicted as ‘Before’, ‘Learning’, ‘After-1’, and ‘After-2’, respectively. j, Retention of AS-PaRac1 (green) or structural potentiation (magenta). k, Trajectory of spine size and AS-PaRac1 intensities of the structurally potentiated spines. Scale bars, 2 μm for c; 200 μm for b and i. Error bars represent s.e.m. NS, not significant.




of AS-PaRac1 signal was located in the dendritic spines, and the labelling of shaft synapses was negligible (Extended Data Fig. 5). When we set the threshold of AS-PaRac1 at 1 a.u. (Fig. 2f, green shaded area, 0 day), AS-PaRac1 detected spine formation and enlargement with sensitivities of 83 ± 7.9% and 69 ± 3.0% (mean ± standard error of the mean), respectively (Fig. 2h), whereas labelling in other spine types was 2.3 ± 0.12%. Since Arc::AS-PaRac1 was induced only in the AS-PaRac1-positive neuron, the labelling properties in the AS-PaRac1-positive neuron were also calculated. The sensitivities for formation and enlargement were 94 ± 2.7% and 95 ± 4.8%, respectively, while false labelling in other spine types was 12.9 ± 4.2% (306 spines, 6 AS-PaRac1-positive neurons in 3 mice). Therefore, AS-PaRac1 is a reliable marker of the potentiated spines in vivo.
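
The sensitivity and false-labelling figures above amount to a per-category labelling rate at a fixed intensity threshold. A minimal sketch of that bookkeeping follows; the function name, the category labels and the example intensities are hypothetical, and only the 1 a.u. threshold is taken from the text.

```python
from collections import defaultdict

THRESHOLD_AU = 1.0  # labelling threshold stated in the text (1 a.u.)


def labelled_fraction_by_category(spines, threshold=THRESHOLD_AU):
    """Return, per spine category, the fraction of spines counted as labelled.

    `spines` is an iterable of (category, probe_intensity) pairs. For
    structurally potentiated categories ("new", "enlarged") the fraction
    is a sensitivity; for other categories it is a false-labelling rate.
    """
    totals = defaultdict(int)
    labelled = defaultdict(int)
    for category, intensity in spines:
        totals[category] += 1
        if intensity >= threshold:
            labelled[category] += 1
    return {cat: labelled[cat] / totals[cat] for cat in totals}


if __name__ == "__main__":
    # Hypothetical measurements: (category, probe intensity in a.u.).
    example = [
        ("new", 2.4), ("new", 1.3), ("new", 0.4),
        ("enlarged", 1.8), ("enlarged", 0.7),
        ("other", 0.1), ("other", 0.2), ("other", 1.1), ("other", 0.0),
    ]
    for category, fraction in labelled_fraction_by_category(example).items():
        print(f"{category:9s} {fraction:.0%}")
```

Restricting the input to spines on probe-positive neurons, as the authors do for their second set of numbers, changes only which spines are passed in, not the calculation.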

Next, we performed wide-view mapping of task-evoked potentiation using this probe (Fig. 2i, learning period), and we found that the task-evoked potentiation was elicited in 2.3 ± 0.13% of spines and 16.4 ± 2.8% of neurons in the imaged area. We tracked nearly complete images of neurons (Extended Data Fig. 4b) and confirmed that when a spine was labelled by AS-PaRac1, its parental soma also expressed AS-PaRac1 (6 AS-PaRac1-positive somata). Consistently, we could not find AS-PaRac1-positive spines in AS-PaRac1-negative soma (46 negative somata). Thus, the counting of AS-PaRac1 puncta per AS-PaRac1-positive neurons could be approximated, which suggested that 14.7 ± 2.01% of spines contained AS-PaRac1 in the AS-PaRac1-positive neurons, implying that upon motor learning, a substantial remodelling of spines (14.7%) was evoked in a small neuronal population (16.4%) in layer II/III (Extended Data Fig. 4d for detailed calculation). Similarly, a substantial remodelling was also observed in a small population of layer V neurons (Extended Data Fig. 4d).
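
The relationship between these three percentages can be sanity-checked with a back-of-the-envelope calculation. The sketch below is not the authors' calculation (that is in their Extended Data Fig. 4d); it assumes that labelled spines occur only on probe-positive neurons, as reported, and additionally that every neuron carries a similar number of spines. The function name is hypothetical.

```python
def fraction_within_positive_neurons(spine_fraction, neuron_fraction):
    """Estimate the labelled-spine fraction inside probe-positive neurons.

    Assumes labelled spines sit only on positive neurons and that all
    neurons contribute roughly equal numbers of spines, so the overall
    labelled fraction is simply diluted by the share of positive neurons.
    """
    if not 0.0 < neuron_fraction <= 1.0:
        raise ValueError("neuron_fraction must be in (0, 1]")
    if not 0.0 <= spine_fraction <= neuron_fraction:
        raise ValueError("spine_fraction must be in [0, neuron_fraction]")
    return spine_fraction / neuron_fraction


if __name__ == "__main__":
    # Values quoted in the text: 2.3% of spines, 16.4% of neurons.
    estimate = fraction_within_positive_neurons(0.023, 0.164)
    print(f"estimated labelled fraction in positive neurons: {estimate:.1%}")
```

Under these assumptions the estimate comes out at roughly 14%, close to the reported 14.7 ± 2.01%; the small difference is expected, because the reported value is derived from counts in tracked neurons rather than from this ratio.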

To characterize the synaptic retention of AS-PaRac1 for photoactivation experiments in vivo, the individual spines that acquired AS-PaRac1 were tracked, and were separately schematized from the day of AS-PaRac1 appearance (Fig. 2c, i). We noticed that persistence of synaptic AS-PaRac1 and the structural potentiation markedly varied among spines: some were preserved beyond 1 day after training (Fig. 2c, dendrite no. 1), while others disappeared (Fig. 2c, dendrites no. 2 and 3). Importantly, the structural potentiation and AS-PaRac1 labelling triggered during the ‘learning’ period were more likely to be preserved than those triggered during the training-free period (Fig. 2j, ‘Before’ and ‘After-1 (potentiation 1 day after learning)’). Consistently, longitudinal imaging of the structurally potentiated spines revealed that the majority of those retaining AS-PaRac1 for 24 h maintained structural potentiation for at least 48 h (Fig. 2k, green trace), whereas the structurally potentiated spines lacking AS-PaRac1 retention returned to the pre-potentiated state (Fig. 2k, black trace). Such AS-PaRac1 retention might be maintained by reverberation of learning-activated neuronal circuits, because AS-PaRac1 was only expressed in Arc-expressing neurons, in which the persistent activation helps to maintain plastic changes in the neocortex.


Selective spine shrinkage by AS-PaRac1


Consistent with the previous findings that prolonged Rac1 activation induces spine shrinkage, we found that low-frequency photoactivation elicited spine shrinkage (Extended Data Fig. 6). Intriguingly, the spine shrinkage was significantly more robust when the AS-PaRac1 construct was driven by the Arc promoter compared with the constitutive promoter CAG. Arc expression is increased by persistent neuronal activity, which induces the chronic activation of endogenous Rac1, possibly contributing to the robust spine shrinkage by Arc::AS-PaRac1. Photoactivation-induced spine shrinkage was Rac1-dependent, because deletion of Rac1 from AS-PaRac1 while keeping other domains intact within AS-PaRac1 (Arc::PSDΔ1.2-LOV-DTE) completely disrupted the shrinkage effect (Extended Data Fig. 6). To achieve spine shrinkage in a large cortical area in vivo, bilateral optical fibres were placed onto the cranial window (Fig. 3a and Extended Data Fig. 7). Low-frequency pulsed photoactivation triggered shrinkage specifically in the AS-PaRac1-containing spines (Fig. 3b, c). The effect of photoactivation was comparable at least within 100 μm from the dura, suggesting that spines in layer I, at least, were affected by photoactivation (Fig. 3d). Photoactivation-induced spine shrinkage was accompanied by functional depotentiation, which was demonstrated by the excitatory postsynaptic calcium transients: the extent of spine shrinkage correlated with the decrease in amplitude, but not with the decrease in frequency (Fig. 3f-j). Spine shrinkage and the subsequent functional changes were spine-specific, but not branch- or cell-wide, because spine shrinkage was not triggered in neighbouring AS-PaRac1-negative spines, and the calcium transient was not affected either in the neighbouring spines or in the soma (Fig. 3f-j; Extended Data Figs 6 and 7b).


Optical erasure of acquired skills 

To demonstrate the effect of spine shrinkage on learning, mice were bilaterally injected with the adeno-associated virus (AAV) 5 that encompassed layers I to V (Extended Data Fig. 7f). Mice were divided into two groups: animals in the first group were transfected with monomeric red fluorescent protein (mRFP) alone as a control, and the second group was transfected with AS-PaRac1 and mRFP. Both groups exhibited significantly better motor performance after training, but only the performance of the AS-PaRac1 group was disturbed by photoactivation (protocol 1, Fig. 4a, b), and the extent of learning disruption induced by photoactivation (photoactivation

[Figure 3 image not reproduced. Recoverable panel text: a, fibre, cement, headgear; b, before PA, after PA (DsRed); c, AS (−) and AS (+) spines; d, depth from dura (μm); e, f, mRFP/GCaMP, merge, GCaMP, AS-mTq, mRFP; g, h, dendrite and soma traces versus time (s); i, j, ΔV (%).]

Figure 3 | Selective shrinkage of AS-PaRac1-containing spines upon photoactivation (PA). a, Illustration of photoactivation. b, Images of the hind limb regions of cortices. SARE::AS-PaRac1 and CAG::mRFP were transduced by in utero EP (a-d). c, Spine size following photoactivation. Dark green circles are eliminated spines. d, The effect of cortical depth on photoactivation-induced spine shrinkage. e-j, Hippocampal neuron cultures were co-transfected with GCaMP6s, AS-PaRac1-mTurquoise, and mRFP. Changes in the GCaMP/mRFP ratio (ΔR) in synapse (g) and soma (h) were traced. i, j, Relationships between ΔV and ΔAmplitude (i), or ΔFrequency (j) upon photoactivation. Circles 1, 2, and S correspond to spine no. 1, no. 2, and the soma in g and h. Scale bars, 5 μm for b; 50 μm for e; 2 μm for f.




effect) negatively correlated with the extent of training-evoked 
improvement (learning attainment) (Fig. 4f). In contrast, there 
was neither disruption of acquired learning nor a correlation 
between the effects of the training and photoactivation in the control 
group. Since photoactivation did not affect the running speed of the 
identical cohort used in Fig. 4b, it is unlikely that photoactivation 
disturbed the general motor performance (Extended Data Fig. 8). 

Photoactivation disrupted the acquired learning even 1 day after learning (protocol 2; Fig. 4c, g), when the majority of learning-evoked spines contained AS-PaRac1 (Fig. 2k). In contrast, photoactivation treatment 2 days after learning (protocol 3), when both the number of AS-PaRac1-containing spines and the intensity of AS-PaRac1 labelling were markedly decreased (Fig. 2k; Extended Data Fig. 4), failed to disrupt acquired learning (Fig. 4d, h). Owing to daily spontaneous potentiation, a comparable number of spines contained AS-PaRac1 in both protocol 2 and protocol 3 (Extended Data Fig. 4c). Nonetheless, only protocol 2 disrupted the acquired skill, suggesting that the learning-evoked spine potentiation visualized by AS-PaRac1 (at +1 day), but not spontaneous potentiation (at +2 day), accounted for the cortical memory traces.

To demonstrate the task-specific role of synaptic ensembles, mice
injected with AS-PaRac1-expressing AAV into the bilateral M1 were
subjected to a dual task protocol. Mice sequentially learned two
distinct hind limb tasks: the rotarod and the beam tasks in the first and
second sets of 2 days, respectively (Fig. 4i). We performed the
photoactivation on day 4, because the majority of the rotarod-evoked
AS-PaRac1 puncta had diminished by this time point (Fig. 2k). We confirmed
that these two tasks evoked a comparable amount of spine potentiation
(Extended Data Fig. 7c). While learning performance in the beam
task was not disrupted by the sham photoactivation treatment (the fibre
was inserted, but no illumination was performed), photoactivation
disrupted the acquired performance in the beam task without affecting
the rotarod performance (Fig. 4j). We found no correlation
between the effects of photoactivation on the rotarod and the beam
tasks, which implies that the synaptic ensembles recruited by each task did
not overlap (Fig. 4k).


Task-specific synaptic ensemble 


To visualize the synaptic ensembles formed during dual task learning,
mice were sparsely labelled with AS-PaRac1 and subjected
to the dual task protocol described above (Fig. 5a, dual task).
AS-PaRac1 puncta were classified on the basis of time of emergence
(Fig. 5b), schematized for the rotarod task potentiation (day 2
specific) as blue dots, for the beam task potentiation (day 4 specific) as
yellow dots, and for the continuous potentiation for both periods
(both day 2 and 4) as green dots. Interestingly, more than half of
the spines potentiated by the beam task were new ones (Fig. 5n) that had
not been potentiated previously in the rotarod task (yellow, Fig. 5c-e).
Taken together with the behavioural data (Fig. 4i-k), these results
demonstrate that the two learning tasks induced the potentiation of distinct
synaptic ensembles.

Finally, we examined whether the same spines are potentiated by 
the same task. Mice were divided into 2 groups (Fig. 5a). The first 
group was subjected to the rotarod task in the first 2 days, which was 
followed by the shrinkage of the learning-evoked potentiation by 
photoactivation, and then the identical rotarod task was re-trained 
(re-training condition). The second group was subjected to the 
rotarod task and subsequent photoactivation, and mice were not 
trained for another 2 days (home cage condition). We found that 
the majority of the optically shrunk spines returned to their prev- 
iously potentiated size after re-training, while the degree of re- 
potentiation was significantly lower in the home cage group, sug- 
gesting that re-training induced the re-potentiation of the same 
subset of spines (Extended Data Fig. 7d, e). Mice assigned to the 
dual task protocol were also compared, highlighting the difference 
in the potentiation patterns among the groups during the last 2 days 




[Figure 4 graphic. Panel a: timelines for protocols 1, 2 and 3 (photoactivation at 0, +1 and +2 days after rotarod training). Panels b-d: rotarod latency to fall (s). Panels f-h: effect of photoactivation plotted against learning attainment by training (s). Panels i-k: dual-task design with rotarod and beam training, performance tests, and PA effect on beam (s).]


Figure 4 | Erasure of acquired learning by the photoactivation of spines
labelled with AS-PaRac1. a, Experimental design (see Extended Data Fig. 9).
b-d, Mice, which received AAV infection of SARE::AS-PaRac1, were allocated
to protocols 1 (b), 2 (c), or 3 (d). An average of three trials of each mouse
was used as the task performance (grey line). e, The critical period of
photoactivation to erase acquired skills. f-h, Relationship between the effect
of photoactivation and learning attainments. i, Experimental design.
j, Performance trajectory of each skill. k, No correlation between
photoactivation effect on acquired rotarod performance and that of beam
task. Error bars represent s.e.m.


(Fig. 5c-n). Contrary to the dual task group, spines potentiated 
during the first rotarod training were more likely to be re-poten- 
tiated after the second rotarod training in the re-training group 
(green, Fig. 5f-h, l, n), while re-potentiation was significantly less
prominent in mice that did not perform the re-training task (home
cage group) (Fig. 5i-k, l, n; Extended Data Fig. 7d, e). Furthermore,
newly potentiated spines, which were not potentiated in the first 
2 days, were less abundant in the re-training and home cage groups 
compared with the dual task group (yellow, Fig. 5m, n). These 
findings suggest that reorganization of distinct synaptic ensembles 
is specific for each learning task. 




[Figure 5 graphic. Panel a: timelines for the dual tasks, retraining and home cage groups (rotarod training, beam training, two-photon imaging, PA). Panels c-k: spine maps and spine classification (day 2 specific, both day 2/4, day 4 specific, AS(-) spine) for each group on day 2 and day 4. Panels l-n: potentiated spines on day 4 in each group, previously versus newly potentiated.]


Figure 5 | Visualization of synaptic ensembles for distinct learning tasks.
a, Experimental design. Arc::AS-PaRac1 and CAG::mRFP (filler) were
transduced by in utero electroporation. b, Images of dendrites upon learning.
AS-PaRac1 puncta are colour-coded based on their appearance and duration.
Identical colour codes are used in c-n. c, f, i, Wide view mapping of AS-PaRac1.
d, g, j, The fraction of each spine type. e, h, k, The trajectory of spine size
(V). l, m, Differential spine potentiation in each condition. n, The proportions
of newly potentiated spines. Scale bars, 2 µm for b; 50 µm for c, f and i. Error
bars represent s.e.m.


Discussion 


Current models of learning and memory suggest that structural
plasticity of spines is the underlying mechanism of information storage in
the brain. Nonetheless, clear visualization of spine structure in vivo
requires the sparse labelling of neurons, and analysis of structural
changes in spines is very laborious. In contrast, the
AS-PaRac1 signal appears as fluorescent puncta, which allows
potentiated spines to be detected far more easily, even under
high-transfection conditions. Moreover, the role of potentiated spines can be
directly assessed with photoactivation during behavioural
examinations. In this study, we showed that photoactivation of the bilateral
M1 cortex disrupted the acquired motor skill. We estimated that
approximately 4,700 learning-evoked neurons were affected by
photoactivation, based on the following calculation: (a)
× (b) × (c) × (d) × (e), in which (a) represents the density of neurons
in the neocortex, 9.2 × 10⁴ per mm³ (ref. 38); (b) the photoactivated area,
fibre core diameter = 500 µm, 0.4 mm² bilaterally; (c) the thickness of
cortical layers (II-V) that were infected with AAV, 0.8 mm; (d) AAV
infection efficiency, 80% (Extended Data Fig. 7f); (e) the percentage of
AS-PaRac1-positive neurons upon learning, 20% (Extended Data Fig.
4d). On the other hand, owing to the limitations of light transmission,
the majority of the shrunk spines resided in layer I (up to 100 µm from
the dura). The minimal number of learning-evoked spines
illuminated by the optical fibre was roughly 410,000 in the bilateral
M1 cortex, based on the following calculation: (d) × (f) × (g) × (h), in
which (f) represents the density of excitatory synapses in the mouse
neocortex, 6.4 × 10⁸ per mm³ (ref. 38); (g) learning-evoked potentiation,
approximately 2% of the spines in this area (Extended Data Fig. 4);
(h) brain volume that received photoactivation, 0.4 mm² of
photoactivation area × 0.1 mm of depth = 0.04 mm³. Layer I
concentrates corticocortical feedback projections mediating top-down
influences, which strongly excite a subpopulation of pyramidal
neurons (ref. 39). Learning-evoked changes in neuronal ensembles
via the synaptic reorganization of the M1 cortex directly predict
future task performance (ref. 40). Nonlinear information integration
primarily occurs in the tuft of dendrites in behaving animals (ref. 41), and
activation of several spines in the tuft is sufficient to initiate NMDA
spikes for action potential generation (ref. 42). Thus, the shrinkage of
potentiated spines in our study (410,000 spines in the dendritic tufts
of 4,700 neurons) would be reasonably expected to disrupt the
learning-evoked substantial remodelling in a specific neuronal
population. Formation of the dense connections in a small neuronal
ensemble may be consistent with the formation of functional
neuronal clusters in the motor cortex after learning (ref. 43). Thus, synaptic
optogenetics might be a powerful tool to uncover the mechanism
of synaptic plasticity and its relationships with subsequent
behavioural manifestations.
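
The two order-of-magnitude estimates in the Discussion can be reproduced with a few lines of arithmetic. The sketch below is only a check of the stated products; the variable names are ours, and the fibre-area line assumes the 0.4 mm² figure is the combined circular cross-section of two 500-µm cores, which the text implies but does not spell out.

```python
import math

# Quantities as stated in the Discussion; names are illustrative only.
neuron_density_per_mm3 = 9.2e4     # (a) neurons per mm^3 in neocortex
pa_area_mm2 = 0.4                  # (b) photoactivated area, both hemispheres
infected_thickness_mm = 0.8        # (c) thickness of AAV-infected layers II-V
aav_efficiency = 0.80              # (d) AAV infection efficiency
as_parac1_positive = 0.20          # (e) fraction of neurons AS-PaRac1-positive after learning
synapse_density_per_mm3 = 6.4e8    # (f) excitatory synapses per mm^3
potentiated_fraction = 0.02        # (g) learning-evoked potentiation, ~2% of spines
pa_depth_mm = 0.1                  # illuminated depth (layer I, up to 100 um)
pa_volume_mm3 = pa_area_mm2 * pa_depth_mm  # (h) = 0.04 mm^3

# (a) x (b) x (c) x (d) x (e): learning-evoked neurons under the fibres
neurons_affected = (neuron_density_per_mm3 * pa_area_mm2 * infected_thickness_mm
                    * aav_efficiency * as_parac1_positive)

# (d) x (f) x (g) x (h): learning-evoked spines illuminated by the fibres
spines_illuminated = (aav_efficiency * synapse_density_per_mm3
                      * potentiated_fraction * pa_volume_mm3)

# Assumption: area (b) is two circular fibre cores of 0.5 mm diameter.
fibre_area_mm2 = 2 * math.pi * (0.5 / 2) ** 2

print(f"neurons affected   ~ {neurons_affected:,.0f}")
print(f"spines illuminated ~ {spines_illuminated:,.0f}")
print(f"bilateral fibre area ~ {fibre_area_mm2:.2f} mm^2")
```

Both products land within rounding of the values quoted in the text (approximately 4,700 neurons and 410,000 spines), which also supports reading the garbled exponents as 10⁴ and 10⁸.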


Online Content Methods, along with any additional Extended Data display items 
and Source Data, are available in the online version of the paper; references unique 
to these sections appear only in the online paper. 


Received 6 May 2014; accepted 3 August 2015. 
Published online 9 September 2015. 


1. Bernstein, J. G. & Boyden, E. S. Optogenetic tools for analyzing the neural circuits of
behavior. Trends Cogn. Sci. 15, 592-600 (2011).

2. Tye, K. M. & Deisseroth, K. Optogenetic investigation of neural circuits underlying
brain disease in animal models. Nature Rev. Neurosci. 13, 251-266 (2012).

3. Liu, X. et al. Optogenetic stimulation of a hippocampal engram activates fear
memory recall. Nature 484, 381-385 (2012).

4. Holtmaat, A. & Svoboda, K. Experience-dependent structural synaptic plasticity in
the mammalian brain. Nature Rev. Neurosci. 10, 647-658 (2009).

5. Kasai, H., Fukuda, M., Watanabe, S., Hayashi-Takagi, A. & Noguchi, J. Structural
dynamics of dendritic spines in memory and cognition. Trends Neurosci. 33,
121-129 (2010).

6. Yuste, R. Dendritic spines and distributed circuits. Neuron 71, 772-781 (2011).

7. Murakoshi, H. & Yasuda, R. Postsynaptic signaling during plasticity of dendritic
spines. Trends Neurosci. 35, 135-143 (2012).

8. Luo, L. et al. Differential effects of the Rac GTPase on Purkinje cell axons and
dendritic trunks and spines. Nature 379, 837-840 (1996).

9. Tashiro, A., Minden, A. & Yuste, R. Regulation of dendritic spine morphology by the
rho family of small GTPases: antagonistic roles of Rac and Rho. Cereb. Cortex 10,
927-938 (2000).

10. Hayashi-Takagi, A. et al. Disrupted-in-Schizophrenia 1 (DISC1) regulates spines of
the glutamate synapse via Rac1. Nature Neurosci. 13, 327-332 (2010).

11. Hayashi-Takagi, A. et al. PAKs inhibitors ameliorate schizophrenia-associated
dendritic spine deterioration in vitro and in vivo during late adolescence. Proc. Natl
Acad. Sci. USA 111, 6461-6466 (2014).

12. Wu, Y. I. et al. A genetically encoded photoactivatable Rac controls the motility of
living cells. Nature 461, 104-108 (2009).

13. Redondo, R. L. & Morris, R. G. Making memories last: the synaptic tagging and
capture hypothesis. Nature Rev. Neurosci. 12, 17-30 (2011).

14. Rogerson, T. et al. Synaptic tagging during memory allocation. Nature Rev.
Neurosci. 15, 157-169 (2014).

15. Arnold, D. B. & Clapham, D. E. Molecular determinants for subcellular localization
of PSD-95 with an interacting K+ channel. Neuron 23, 149-157 (1999).

16. Kobayashi, H., Yamamoto, S., Maruo, T. & Murakami, F. Identification of a cis-acting
element required for dendritic targeting of activity-regulated cytoskeleton-
associated protein mRNA. Eur. J. Neurosci. 22, 2977-2984 (2005).

17. Steward, O., Wallace, C. S., Lyford, G. L. & Worley, P. F. Synaptic activation causes
the mRNA for the IEG Arc to localize selectively near activated postsynaptic sites on
dendrites. Neuron 21, 741-751 (1998).

18. Steward, O. & Worley, P. F. Selective targeting of newly synthesized Arc mRNA to
active synapses requires NMDA receptor activation. Neuron 30, 227-240 (2001).

19. Korb, E. & Finkbeiner, S. Arc in synaptic plasticity: from gene to behavior. Trends
Neurosci. 34, 591-598 (2011).

20. Makino, H. & Malinow, R. Compartmentalized versus global synaptic plasticity on
dendrites controlled by experience. Neuron 72, 1001-1011 (2011).

21. Zhang, Y., Cudmore, R. H., Lin, D. T., Linden, D. J. & Huganir, R. L. Visualization of
NMDA receptor-dependent AMPA receptor synaptic plasticity in vivo. Nature
Neurosci. 18, 402-407 (2015).

22. Govindarajan, A., Israely, I., Huang, S. Y. & Tonegawa, S. The dendritic branch is the
preferred integrative unit for protein synthesis-dependent LTP. Neuron 69,
132-146 (2011).

23. Tanaka, J. et al. Protein synthesis and neurotrophin-dependent structural
plasticity of single dendritic spines. Science 319, 1683-1687 (2008).



24. Harvey, C. D. & Svoboda, K. Locally dynamic synaptic learning rules in pyramidal
neuron dendrites. Nature 450, 1195-1200 (2007).

25. Colledge, M. et al. Ubiquitination regulates PSD-95 degradation and AMPA
receptor surface expression. Neuron 40, 595-607 (2003).

26. Ren, M., Cao, V., Ye, Y., Manji, H. K. & Wang, K. H. Arc regulates experience-
dependent persistent firing patterns in frontal cortex. J. Neurosci. 34, 6583-6595
(2014).

27. Kawashima, T. et al. Synaptic activity-responsive element in the Arc/Arg3.1
promoter essential for synapse-to-nucleus signaling in activated neurons. Proc.
Natl Acad. Sci. USA 106, 316-321 (2009).

28. Yu, X. & Zuo, Y. Spine plasticity in the motor cortex. Curr. Opin. Neurobiol. 21,
169-174 (2011).

29. Masamizu, Y. et al. Two distinct layer-specific dynamics of cortical ensembles
during learning of a motor task. Nature Neurosci. 17, 987-994 (2014).

30. Peters, A. J., Chen, S. X. & Komiyama, T. Emergence of reproducible spatiotemporal
activity during motor learning. Nature 510, 263-267 (2014).

31. Cichon, J. & Gan, W. B. Branch-specific dendritic Ca²⁺ spikes cause persistent
synaptic plasticity. Nature 520, 180-185 (2015).

32. Hira, R. et al. Transcranial optogenetic stimulation for functional mapping of the
motor cortex. J. Neurosci. Methods 179, 258-263 (2009).

33. Yang, G., Pan, F. & Gan, W. B. Stably maintained dendritic spines are associated
with lifelong memories. Nature 462, 920-924 (2009).

34. Fu, M., Yu, X., Lu, J. & Zuo, Y. Repetitive motor learning induces coordinated
formation of clustered dendritic spines in vivo. Nature 483, 92-95 (2012).

35. Frey, U. & Morris, R. G. Synaptic tagging and long-term potentiation. Nature 385,
533-536 (1997).

36. Okada, D., Ozawa, F. & Inokuchi, K. Input-specific spine entry of soma-derived
Vesl-1S protein conforms to synaptic tagging. Science 324, 904-909 (2009).

37. Wang, K. H. et al. In vivo two-photon imaging reveals a role of Arc in enhancing
orientation specificity in visual cortex. Cell 126, 389-402 (2006).

38. Schüz, A. & Palm, G. Density of neurons and synapses in the cerebral cortex of the
mouse. J. Comp. Neurol. 286, 442-455 (1989).




39. Cauller, L. Layer I of primary sensory neocortex: where top-down converges upon
bottom-up. Behav. Brain Res. 71, 163-170 (1995).

40. Laubach, M., Wessberg, J. & Nicolelis, M. A. Cortical ensemble activity increasingly
predicts behaviour outcomes during learning of a motor task. Nature 405,
567-571 (2000).

41. Xu, N. L. et al. Nonlinear dendritic integration of sensory and motor input during an
active sensing task. Nature 492, 247-251 (2012).

42. Larkum, M. E., Nevian, T., Sandler, M., Polsky, A. & Schiller, J. Synaptic integration in
tuft dendrites of layer 5 pyramidal neurons: a new unifying principle. Science 325,
756-760 (2009).

43. Hira, R. et al. Spatiotemporal dynamics of functional clusters of neurons in the
mouse motor cortex during a voluntary movement. J. Neurosci. 33, 1377-1390
(2013).


Acknowledgements We thank H. Bito and H. Okuno for the generous gift of the Arc
promoter; F. Murakami for the information about Arc 3' UTR; M. Yuzaki, K. Inokuchi,
and K. Fox for discussions. This research was supported by Grants-in-Aid from the
Ministry of Education, Culture, Sports, Science, and Technology (MEXT, Japan; No.
2000009 to H.K. and No. 26221011 to H.K. and A.H.-T., No. 23689055 and No.
24116003 to A.H.-T.), the PRESTO program (JST) to A.H.-T., the brain/MIND and SICP
projects from Japan Agency for Medical Research and Development (AMED) to H.K., the
National Institutes of Health grant GM102924 to K.M.H., NS071216 to Y.I.W. and the
Research Grant from the Human Frontier Science Program to H.K., K.M.H. and B.K.


Author Contributions A.H.-T., S.Y., M.N., and F.S. conducted the experiments. Y.I.W.,
A.L.L., K.M.H., and B.K. provided technical support for the development of PaRac1.
A.H.-T. and H.K. designed the study and wrote the manuscript.


Author Information Reprints and permissions information is available at 
www.nature.com/reprints. The authors declare no competing financial interests. 
Readers are welcome to comment on the online version of the paper. Correspondence 
and requests for materials should be addressed to A.H.-T. 
(hayashi888@m.u-tokyo.ac.jp) or H.K. (hkasai@m.u-tokyo.ac.jp). 


METHODS 


Ethical considerations. The use and care of animals in this study followed the 
guidelines of the Animal Experimental Committee of the Faculty of Medicine at 
the University of Tokyo. 

Plasmid construction and transfection. Mutagenesis and deletion of cDNA
were conducted based on previously described methods’. Briefly, L514K and
L531E mutations of the LOV2 domain were introduced with the following
primers (mutations underlined): 5'-ctttattggggttcagaaggatggaactgagcatg-3',
5'-gagagagegagtcatggagattaagaaaactgcag-3', and with their corresponding
complementary primers. PSD-95(ΔPDZ1.2) was generated by deleting the
nucleotides (nts) 250 to 993 based on the numbering of NM_019621. The DTE sequence
of Arc mRNA was cloned from the 1st strand cDNA generated from the frontal
cortex of postnatal day 50 (P50) Sprague-Dawley rats with the following primers
(HindIII underlined): 5'-atgataagctttcggctccatgactcagccatgcc-3' and
5'-atgataagcttagacacgagcagttaccaacacg-3'. The generated amplicon, which
corresponded to 2036-2699 nts based on the numbering of NM_019361, was
subcloned immediately downstream of the stop codon of PaRac1.
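
A few properties of the primers and coordinates listed above can be checked mechanically. The sketch below is illustrative only: the function and variable names are ours, inclusive nucleotide numbering is an assumption, and only primers whose sequence survived extraction cleanly are used.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (a/c/g/t only)."""
    pairs = {"a": "t", "t": "a", "g": "c", "c": "g"}
    return "".join(pairs[base] for base in reversed(seq.lower()))


def span_length(first_nt: int, last_nt: int) -> int:
    """Length of a nucleotide range, assuming inclusive numbering."""
    return last_nt - first_nt + 1


HINDIII_SITE = "aagctt"  # HindIII recognition sequence (palindromic)

# Primers as printed in the Methods.
LOV2_L514K_FORWARD = "ctttattggggttcagaaggatggaactgagcatg"
ARC_DTE_FORWARD = "atgataagctttcggctccatgactcagccatgcc"
ARC_DTE_REVERSE = "atgataagcttagacacgagcagttaccaacacg"

# The "corresponding complementary primer" for a mutagenic primer is its
# reverse complement.
lov2_l514k_complement = reverse_complement(LOV2_L514K_FORWARD)

# Both Arc DTE primers should carry the HindIII site used for subcloning.
dte_primers_have_hindiii = all(
    HINDIII_SITE in primer for primer in (ARC_DTE_FORWARD, ARC_DTE_REVERSE)
)

pdz_deletion_nt = span_length(250, 993)        # deleted from NM_019621
arc_dte_region_nt = span_length(2036, 2699)    # region of NM_019361

print(lov2_l514k_complement)
print(dte_primers_have_hindiii, pdz_deletion_nt, arc_dte_region_nt)
```

Under the inclusive-numbering assumption the PDZ deletion spans 744 nucleotides, a multiple of three, which is what an in-frame deletion would require.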

Isothermal titration calorimetry (ITC). ITC for examining the affinity of 
PaRac1 to the CRIB domain of PAK1 in the lit and dark states was carried out
as described previously”. 

PaRac1 pull-down assay. PaRac1 variants were transfected into HEK293 cells by
lipofection (Lipofectamine 2000; Invitrogen, Carlsbad, CA), and the cells were
divided into lit and dark groups. The cells in the lit group were illuminated with a
white fluorescent lamp (1.5 W for a 10-cm dish, 19 ± 1.0 mW cm⁻²) for 10 min
before cell lysis, and the subsequent immunoprecipitation was performed under
continuous light illumination until the final wash step of the protein precipitates.
Cells in the dark group were manipulated under a yellow fluorescent lamp,
which excluded light at wavelengths below 500 nm to avoid photoactivation.
Cells were lysed in a lysis buffer (150 mM NaCl, 50 mM Tris-HCl pH 7.5, 1%
Triton-X (v/v), 10 mM NaF, 10% glycerol (v/v), 1 mM EDTA, and protease
inhibitor cocktail (Complete; Roche Diagnostics)). Lysates were sonicated
intermittently on an ice-water mixture, and cell debris was cleared by
centrifugation. The soluble fraction was incubated with an anti-GFP antibody
(D253-3; MBL, Nagoya, Japan), followed by co-precipitation with Protein G
Sepharose (GE Healthcare, Little Chalfont, UK). The precipitate was
immunoblotted with an anti-PAK1 antibody (no. 2602; Cell Signaling, Beverly, MA).
Signal intensity of each band (net signal after subtracting the background signal,
which was obtained from the region adjacent to the band) was measured using the
ImageJ software (National Institutes of Health, Bethesda, MD).
Immunofluorescence. Cell staining was performed as described previously’®. 
Briefly, dissociated rat cortical neurons at 21 days in vitro (DIV) were fixed with 
4% paraformaldehyde (PFA) for 30 min at room temperature. Mice were eutha- 
nized after the behavioural analyses, and their brains were perfusion-fixed with 
4% PFA and sectioned coronally to obtain 150-µm-thick sections. Fixed samples
were then permeabilized with Perm/Blocking buffer (2.5% normal goat serum (v/v)
in phosphate-buffered saline [PBS] with 0.3% Triton X-100 (v/v)) for 1 h at
room temperature. Samples were incubated for 24h at 4°C with the following 
primary antibodies: anti-phospho-neurofilament (SMI-31; Merck KGaA, 
Darmstadt, Germany), axonal marker; anti-PSD-95 (6G6; Abcam, Cambridge, 
UK); anti-Emx1 (sc-28220; Santa Cruz, CA) for the staining of pyramidal neu- 
rons. After rinsing with PBS (3 times, 5 min each), sections were stained with the 
corresponding secondary antibodies, followed by mounting. Cell labelling was 
examined with a confocal microscope (LSM510 META NLO; Carl Zeiss, 
Oberkochen, Germany). 

Hippocampal slice culture and transfection. Hippocampal slices (350-µm
thick) were prepared from Sprague-Dawley rats at P7 with a vibratome
(VT1200S; Leica, Wetzlar, Germany) and mounted onto 0.4-µm Millicell culture
inserts (EMD Millipore, Billerica, MA). At DIV 11, slices were transfected
biolistically with a PDS1000/He Biolistic Gene Gun (Bio-Rad, Hercules, CA) with
1.6-µm gold microcarriers. At 2 to 4 days after transfection, cultures were transferred
to the recording chambers and constantly perfused with oxygenated artificial
cerebrospinal fluid (ACSF, 95% O₂ and 5% CO₂) containing 125 mM NaCl,
2.5 mM KCl, 2 mM CaCl₂, 1 mM MgCl₂, 1.25 mM NaH₂PO₄, 26 mM NaHCO₃,
20 mM glucose, and 200 µM Trolox (Sigma-Aldrich, St. Louis, MO) at 29-30 °C.
In some experiments, we added tetrodotoxin (Wako, Osaka, Japan, 1 µM),
bicuculline methiodide (Sigma-Aldrich, 12 µM), lactacystin (EMD Millipore, 10 µM)
to the culture and the recording medium.

In utero electroporation. This procedure was performed according to the
published protocol with minor modifications''. Briefly, pregnant C57BL/6 mice were
anaesthetized at embryonic day 13 (E13) or 14.5 (E14.5) with isoflurane, and
AS-PaRac1-Venus and filler constructs (2 µg each) were injected unilaterally into
the ventricle. Electrode pulses (electrodes: Ø (diameter) 3 mm for E13 and Ø 5 mm
for E14.5; 33 V, 50 ms pulse length, 950 ms pulse interval, 4 pulses) were delivered
unilaterally to target the M1 cortex.

AAV viral production. AAV viral production was performed with the AAV
helper-free system (Agilent Technologies, Santa Clara, CA). The pRep-Cap
(AAV5; Applied Viromics, Fremont, CA) and the pHelper plasmid were
co-transfected into the AAV-293 cells with polyethylenimine 'Max' (Polysciences,
Warrington, PA). After a 72-h incubation, cells were harvested and lysed
with five freeze-thaw cycles. The resultant supernatants were overlaid on 40%
sucrose solution containing 100 mM Tris-HCl (pH 8.0), 150 mM NaCl, and
0.01% BSA (v/v), and were centrifuged at 100,000g for 16 h at 4 °C. The pellet
(crude viral particles) was treated with 1,000 U benzonase nuclease (Novagen,
Madison, WI) for 1 h at 37 °C. After filtering through a 5-µm syringe filter to
remove debris, the filtered material was subjected to CsCl gradient
centrifugation (1.25 g ml⁻¹ and 1.50 g ml⁻¹) at 257,300g for 48 h at 15 °C. The
virus-rich fraction was recovered, and the solvent was replaced with ACSF
(1 mM MgCl₂, 10 mM HEPES, CaCl₂-free). Virus titre was determined with
quantitative real-time PCR analysis (SYBR Green; Takara Bio Inc., Shiga,
Japan).

Virus injection and open-skull cranial window surgery. Adult male C57BL/6
mice were anaesthetized with isoflurane, and mannitol (4 µg per g of body weight)
and dexamethasone (7 µg per g of body weight) were administered
intraperitoneally to prevent brain swelling. Subcutaneous injections of ketoprofen (40 µg per g
body weight) and penicillin/streptomycin (4 U per g body weight) were
administered for 4 consecutive days beginning 1 day before the operation to prevent
inflammation. The skull was exposed over the M1 cortex based on stereotactic
coordinates. Then, 1 µl of AAV (0.5 to 8.0 × 107% genome copies ml⁻¹) was
injected in the M1 cortex using a glass pipette (tip diameter 30 µm, bevelled at
an angle of 45°) at a rate of 150 nl min⁻¹ using a syringe pump (Legato130;
Muromachi Kikai, Tokyo, Japan). The location of the injection site was
standardized among animals by using stereotaxic coordinates (AP = −0.8; ML = +1.0;
DV = +0.5) from the skull. At the end of the injection, we waited 5 min before
retracting the pipette. Stainless steel trephines (Ø 2.7 mm; Fine Science Tools,
Foster City, CA) were used to generate a circular open skull window. To avoid
brain damage, intermittent drilling was performed at a speed of 10,000 r.p.m. with
a continuous gentle perfusion of oxygenated ACSF, and we avoided applying
excessive drilling pressure on the skull as much as possible. If we detected no
bleeding, the drilled hole was covered with a circular coverslip (Ø 2.7 mm,
<0.1 mm thickness, Matsunami Glass, Kishiwada, Japan) and sealed with dental
cement (Fuji Lute BC; GC, Tokyo, Japan), which was followed by the attachment
of the headgear for in vivo imaging.

Two-photon imaging, glutamate uncaging, and photoactivation. Two-photon
imaging was performed with an upright microscope (BX61WI; Olympus, Tokyo,
Japan) equipped with an FV1000 laser scanning microscope system (FV1000,
Olympus) and water-immersion objective lenses (LUMPlanFL N, 60×, 1.0 N.A.;
XLPLN25XWMP2, 25×, 1.05 N.A.). Two mode-locked, femtosecond-pulse
Ti:sapphire lasers (MaiTai DeepSee and HP; Spectra Physics, Mountain View,
CA) were used at 1,000 nm for dual-colour imaging (Venus and mRFP) and at
720 nm for glutamate uncaging. For three-colour imaging of mTurquoise/
GCaMP6/mRFP, the two independently captured images at 780 nm
(mTurquoise and mRFP) and 970 nm (GCaMP6 and mRFP) were merged based
on the identical fluorescence signal of mRFP. For in vitro imaging, 10-40 xy
images (5× digital zoom, 512 × 512 pixels) with a z-axis step size of 0.5 µm
were captured. For in vivo imaging, mice were anaesthetized with isoflurane, and
images (2× digital zoom, 1,024 × 1,024 pixels) were captured starting at the dura
and progressing into the brain tissue for up to 650 µm in total with a step size of
1.0 µm. For glutamate uncaging, 8 mM MNI-glutamate (Tocris Bioscience,
Bristol, UK) was dissolved in Mg²⁺-free ACSF containing 1 µM tetrodotoxin,
and using a glass pipette, this solution was applied locally onto the dendrites in the
presence or absence of 10 µM forskolin (Wako) and 5 µM anisomycin (Sigma).
Repetitive (5 Hz, 80×) photolysis of MNI-glutamate in the spine heads was
performed at 720 nm with a pulse duration of 0.6 ms, and intensity of the
uncaging laser was 6 mW under the objective lens.

Data quantification. xy images were stacked by the summation of fluorescence 
values at each pixel. For spine size estimation, individual spines on the dendrites 
were traced manually, and fluorescence intensity of the filler (mRFP, DsRed Ex2, 
or mTurquoise) was measured in the spine-head. For each channel, background 
intensity was subtracted from the fluorescence intensity (arbitrary units, a.u.) of 
each spine. During time-lapse imaging, daily variations in the recording condi- 
tions caused slight alterations in the fluorescence intensity, which was corrected 
with the fluorescence intensity changes of the filler along the parental dendritic 
shaft within a distance of 10 µm from the spine. The ‘Spine enrichment index’ was
estimated based on the previous report*’. To assess the uneven distribution of 




PaRac1 variants in the dendrite, the 'Hot spot index' was calculated using the
following equation:

$$\text{Hot spot index} = \frac{1}{n} \sum_{i=1}^{n} \left| \text{Spine enrichment index}_{i} - \text{Spine enrichment index}_{i+1} \right|$$

where 'Spine enrichment index_i' and 'Spine enrichment index_{i+1}' represent the
enrichment indices of a given spine and of its nearest neighbouring spine,
respectively, and 'n' represents the number of spines in the measured dendritic branch
(20 µm long). The hot spot index was obtained from the most intensively labelled
dendritic segments, and estimated by repetitive measurements of sequential
nearest neighbouring spines. Quantification of fluorescence was performed with the
ImageJ software.
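
As an illustration of the hot spot index, the sketch below averages the absolute difference in spine enrichment index between sequential neighbouring spines along one dendritic segment. It is a minimal reading of the definition above, not the authors' code: the function name is ours, and because the text does not say how the last spine (which has no following neighbour) is handled, the sketch assumes the sum runs over adjacent pairs and is divided by the number of pairs.

```python
from typing import Sequence


def hot_spot_index(enrichment: Sequence[float]) -> float:
    """Mean absolute difference between the spine enrichment indices of
    sequential neighbouring spines, ordered along the dendrite.

    Assumption: averaging is over adjacent pairs (len(enrichment) - 1 terms).
    """
    if len(enrichment) < 2:
        raise ValueError("at least two spines are needed")
    differences = [abs(a - b) for a, b in zip(enrichment, enrichment[1:])]
    return sum(differences) / len(differences)


# Hypothetical enrichment indices, for illustration only.
uniform_labelling = [1.0, 1.0, 1.0, 1.0]
patchy_labelling = [0.2, 3.0, 0.3, 2.8]

print(hot_spot_index(uniform_labelling))  # evenly labelled segment
print(hot_spot_index(patchy_labelling))   # label concentrated in some spines
```

An evenly labelled segment gives an index of zero, whereas a segment in which labelled and unlabelled spines alternate gives a large value, which is the uneven distribution the index is meant to capture.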

In vivo photoactivation in freely moving animals. Mice transduced by either 
AAV injection or in utero gene transfer were subjected to open-skull cranial 
window surgery, and the cranial holes were covered with bilateral glass windows. 
An outer cylinder (a non-bevelled 15-mm long 18G needle with an inner dia- 
meter of 0.9 mm) was implanted on the glass window for photoactivation. Before 
photoactivation, the optical fibre was inserted into the outer cylinder, and the tip 
of the fibre was placed directly onto the glass coverslip. The fibre and the outer 
cylinder were tightly locked together with Blu-Tack, which was easily removed 
after the photoactivation. Photostimulation was carried out using the COME-2
series (Lucir, Osaka, Japan), which consists of 457-nm laser diodes, an optical
swivel, and bilateral optical fibres (COME2-«DF1; core diameter of 500 µm,
0.5 N.A.). The laser diode was adjusted to an output of 20 mW at the tip of each
fibre. The light pulse was delivered for 150 ms at 1 Hz for 1 h, and the process was 
controlled by customised LabView programs (National Instruments, Austin, TX). 
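
The stimulation parameters above imply a simple light budget, sketched here in Python as a worked calculation. The function name and dictionary keys are hypothetical, and the sketch assumes rectangular pulses at constant power (20 mW at the fibre tip) with no light between pulses. It is an arithmetic illustration, not the LabView control program.

```python
def pulse_train_summary(pulse_s, rate_hz, duration_s, power_w):
    """Summarise a pulse train, assuming rectangular pulses of constant power."""
    n_pulses = round(rate_hz * duration_s)   # pulses delivered in total
    duty_cycle = pulse_s * rate_hz           # fraction of time the light is on
    light_on_s = n_pulses * pulse_s          # cumulative illumination time
    energy_j = power_w * light_on_s          # energy delivered per fibre
    return {
        "n_pulses": n_pulses,
        "duty_cycle": duty_cycle,
        "light_on_s": light_on_s,
        "energy_j": energy_j,
    }


if __name__ == "__main__":
    # 150-ms pulses at 1 Hz for 1 h, 20 mW at the tip of each fibre
    print(pulse_train_summary(0.150, 1.0, 3600.0, 0.020))
```

Under these assumptions the protocol delivers 3,600 pulses and 540 s of cumulative illumination, which is about 10.8 J per fibre.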
Behavioural analysis. Mice were housed under standard laboratory conditions 
(12-h light/dark cycle with food and water available ad libitum) and were ran- 
domly allocated to experimental groups. All behavioural analyses were per- 
formed during the light phase. For motor learning (Extended Data Fig. 9), we 
used the rotarod training system (Rota-Rod Treadmills ENV-576; Med 
Associates, St. Albans, VT). Before the training sessions, mice were habituated 
to stay on the stationary rod for 2 min. During the training period, the fixed- 
speed protocol was applied at a slow speed (8 rpm), so mice rarely fell off the rod. 
After the mice were able to remain on the rod reliably, the speed was increased in 
a stepwise fashion to 40 rpm. We applied air puffs to the hind limbs as aversive 
stimuli to teach mice to face forward on the rod (Extended Data Fig. 9c), which 
helped them to hold on at higher speeds. After falling, the mice were immediately
placed back on the rod, and the latency to fall was recorded automatically.
Three training sessions were performed over 2 days (2 h per session, 6 h of
training in total). To assess learning, three trials of the rotarod test were carried
out using an accelerating protocol (4 to 40 rpm) without air puffs, with 5-min
inter-trial intervals.

For balance beam training, a hand-made beam apparatus was used (Extended
Data Fig. 9d). Time to cross was scored using a stopwatch. The timer was started
when the mouse was placed on the beam and stopped when the first forepaw was
placed in the goal cage. Air puffs to the hind limbs were also used to facilitate
learning. Three training sessions were performed over 2 days (2 h per session,
6 h of training in total). To evaluate the acquired performance, three trials of the
beam test were carried out without air puffs, with 5-min inter-trial intervals. Task
performances were calculated as the averages of the three trials for both the
rotarod and beam tasks. Mice with an improvement of <20% compared to the
pre-training performance were excluded from the analyses. The running speed of
mice was measured by a video tracking system (Limelight3; Actimetrics, 
Wilmette, IL). The investigator was not blinded to the group allocation during 
the experiments because all behavioural outcomes were unambiguously deter- 
mined: for example, rotarod performance and locomotion were scored automat- 
ically with infrared or video tracking, and the manual scoring of the cross time for 
the beam test was unambiguous. 
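
The scoring and exclusion rules described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' procedure: the function names are hypothetical, and the text does not say how improvement was computed for the beam task, so the sketch assumes that a longer latency to fall is better on the rotarod and a shorter crossing time is better on the beam.

```python
from statistics import mean


def improvement_percent(pre_trials, post_trials, higher_is_better=True):
    """Percentage improvement of the three-trial average over pre-training."""
    pre = mean(pre_trials)
    post = mean(post_trials)
    if pre <= 0:
        raise ValueError("pre-training performance must be positive")
    change = (post - pre) / pre * 100.0
    # For the beam task a shorter crossing time counts as improvement (assumption).
    return change if higher_is_better else -change


def is_excluded(pre_trials, post_trials, higher_is_better=True, threshold=20.0):
    """True if the mouse improved by less than the 20% criterion."""
    return improvement_percent(pre_trials, post_trials, higher_is_better) < threshold


if __name__ == "__main__":
    # Rotarod latencies to fall (s): average rises from 40 to 70
    print(is_excluded([30, 40, 50], [60, 70, 80]))
    # Beam crossing times (s): average falls from 10 to 7
    print(is_excluded([10, 10, 10], [7, 7, 7], higher_is_better=False))
```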
Statistics. A series of experiments were performed as two, mostly three separate 
cohorts, and sample size was chosen based on the effect size shown in the first 
cohort in order to minimize the number of animals used in compliance with 
ethical guidelines. Data are shown as means ± s.e.m. Detailed information on
statistical methods and results is described in Extended Data Table 1. In brief,
Mann-Whitney U tests were used to identify significant differences between 
two groups. Multiple comparisons were made by one-way analysis of variance 
(ANOVA, normal distribution and equal variances), nonparametric one-way 
ANOVA (Kruskal-Wallis test, for unequal variances), or one-way repeated mea- 
sures ANOVA followed by post-hoc Bonferroni test (to compare task perform- 
ance at different time points for within-subjects groups). Spearman rank 
correlation was used to test the strength of correlation between two variables. 
For all statistical tests *P < 0.05, **P < 0.01, ***P < 0.001 were considered sig- 
nificant. No statistical methods were used to predetermine sample size, the 
experiments were randomized and the investigators were not blinded to outcome 
assessment. 
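
For readers reproducing the reporting conventions above, a minimal Python sketch of the mean ± s.e.m. summary and the asterisk thresholds is given below. The function names are hypothetical. The s.e.m. is computed from the sample standard deviation (n − 1 denominator), which is an assumption, and the "n.s." label for non-significant values is added for illustration only.

```python
from math import sqrt
from statistics import mean, stdev


def mean_sem(values):
    """Return (mean, s.e.m.), using the sample standard deviation (assumption)."""
    n = len(values)
    if n < 2:
        raise ValueError("at least two values are needed")
    return mean(values), stdev(values) / sqrt(n)


def significance_stars(p):
    """Map a P value to the asterisk notation used in the figures."""
    if p < 0.001:
        return "***"
    if p < 0.01:
        return "**"
    if p < 0.05:
        return "*"
    return "n.s."  # label chosen for illustration; not used in the text


if __name__ == "__main__":
    print(mean_sem([2.0, 4.0, 6.0]))
    print(significance_stars(0.0013))
```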


44. Andreassi, C. & Riccio, A. To localize or not to localize: mRNA fate is in 3′UTR ends.
Trends Cell Biol. 19, 465-474 (2009).

45. Wang, D. O., Martin, K. C. & Zukin, R. S. Spatially restricting gene expression by local
translation at synapses. Trends Neurosci. 33, 173-182 (2010).

46. Holt, C. E. & Schuman, E. M. The central dogma decentralized: new perspectives on
RNA function and local translation in neurons. Neuron 80, 648-657 (2013).

47. Gray, N. W., Weimer, R. M., Bureau, I. & Svoboda, K. Rapid redistribution of synaptic
PSD-95 in the neocortex in vivo. PLoS Biol. 4, e370 (2006).

48. Bosch, M. et al. Structural and molecular remodeling of dendritic spine
substructures during long-term potentiation. Neuron 82, 444-459 (2014).

49. Hsueh, Y. P., Kim, E. & Sheng, M. Disulfide-linked head-to-head multimerization in
the mechanism of ion channel clustering by PSD-95. Neuron 18, 803-814 (1997).

50. Kasai, H. et al. Learning rules and persistence of dendritic spines. Eur. J. Neurosci.
32, 241-249 (2010).


[Extended Data Figure 1, panels a–d. a, ITC summary table (binding constant in µM and number of trials) for PaRac1 (C450A, light-insensitive mutant), PaRac1 (I539E, lit-state mutant), and the original and L514K/L531E forms of PaRac1 in the dark and lit states. c, Pull-down immunoblots (IP: PaRac1-Venus; IB: PAK1 and PaRac1-Venus) with bar-graph quantification.]

Extended Data Figure 1 | Optimization of PaRac1 for the synaptic
application. a, Isothermal titration calorimetry (ITC) experiments showing
that the introduction of L514K and L531E mutations into the original PaRac1
construct’? reduced binding with the CRIB domain of PAK1 in the dark. The
light-insensitive form of LOV2 (C450A) and the I539E mutant, which mimics
the unfolded ‘lit state’, were used as negative and positive controls, respectively.
b, Leaky activity of PaRac1 in the dark. In hippocampal neuronal cultures
transfected with the original PaRac1, we observed a bearded appearance of
the soma with numerous ectopic dendrites, while neurons transfected with
PaRac1 (L514K/L531E) were indistinguishable from normal neurons.
c, Assessment of the affinity of PaRac1 to the endogenous PAK1 using a
pull-down assay. HEK293 cells, which were transfected with PaRac1-Venus,
were divided into two groups: lit and dark. The cells in the lit group were
irradiated with a white fluorescent lamp before cell lysis, and
continuous light illumination was present during subsequent
immunoprecipitation until the final wash step of protein precipitants.
Conversely, cells in the dark group were lit with a yellow fluorescent lamp,
which excludes light wavelengths below 500 nm. Co-immunoprecipitation
with PAK1 revealed that PaRac1 (L514K/L531E) barely bound to PAK1 in
the dark (the number of trials is depicted in the bar graph, **P < 0.01 using the
Mann-Whitney U test). d, Targeting of PaRac1 to the postsynaptic density.
PSDΔ1.2-PaRac1 (DTE (−)) was transfected into dissociated cortical neurons
at 21 days in vitro (DIV). Two days after transfection, cells were fixed with
4% PFA, followed by permeabilization for the subsequent immunostaining
procedure. Axons and endogenous PSD-95 were visualized using the
anti-phospho-neurofilament and anti-PSD-95 antibodies, respectively, revealing
that PSDΔ1.2-PaRac1 co-localized with the endogenous PSD-95. Note
that PSDΔ1.2-PaRac1 did not co-localize with the axonal marker.


[Extended Data Figure 2, panels a–g. a, Timeline for P7 rat hippocampal neuron culture: transfection, AS-Venus expression, bicuculline (20 µM, 6 h) or TTX treatment, and imaging at DIV 11–13. c, d, Vehicle, BIC and TTX conditions. e, Scatter plot of enrichment index (AS-PaRac1) against enrichment index (SEP-GluA1); rs = 0.59, P < 0.001. f, Schematics of AS-PaRac1 (PSD-PaRac1-DTE) and PSD-PaRac1, with mRFP filler. g, Time courses (−20 to 60 min) for stimulated and neighbouring spines.]


Extended Data Figure 2 | The distribution of AS-PaRac1 is regulated by
neuronal activity, and is dependent on the dendritic targeting element
(DTE). a, Experimental design. b, Representative image of a cultured
hippocampal neuron. c, Bicuculline (BIC) or tetrodotoxin (TTX) was added
to the culture media at the designated time points. Images were captured at a
high magnification and were tiled to visualize the entire cell. Green circles
represent the AS-PaRac1 puncta. d, Quantification of AS-PaRac1 distribution
(n = 6 each, *P < 0.05, **P < 0.01 using Kruskal-Wallis test followed by
post-hoc Dunnett’s test). e, Concomitant accumulation of AS-PaRac1 and
SEP-GluA1 in spines. Neurons were co-transfected with mTq (mTurquoise, filler),
SEP-GluA1, and AS-PaRac1-mRFP, and the constructs were expressed for
36 h. Spines potentiated during the 36 h are indicated by SEP-GluA1 fluorescence
(arrowheads). Spearman rank correlation revealed a significant correlation
between the spine enrichment indices of SEP-GluA1 and AS-PaRac1 (each
circle represents one spine, 235 spines, 29 dendrites). f, Schematic of the
constructs and representative images of single spine potentiations by
glutamate uncaging in the presence of FSK (arrowheads). Rat hippocampal
slice cultures were biolistically transfected with either AS-PaRac1 or
PSD-PaRac1 (DTE (−)) followed by the uncaging experiments at DIV 13 (equivalent
to postnatal day 20). g, Time course of the spine head volume (V) and
accumulation of Venus upon uncaging. The mean changes in spine size and
Venus accumulation in the stimulated or neighbouring spines are depicted
60 min after uncaging. For quantification, we used pooled data from
independent identically designed experiments. The data set for AS-PaRac1
was identical to the FSK-treated group of Fig. 1c–e. Scale bars, 1 µm.
*P < 0.05 using the Mann-Whitney U test (n = 6 or 11 dendrites for
PSD-PaRac1 or AS-PaRac1, respectively).


[Extended Data Figure 3, panels a and b. a, Schematic of the PSD-PaRac1 construct: somatic translation, robust protein expression (generation > degradation), diffusion throughout dendrites, integration into the PSD by constitutive turnover, and expression proportional to the spine size. b, Schematic of the AS-PaRac1 construct: steps (1) to (6) leading to potentiation-specific accumulation, with a lactacystin panel (w/o Lac, w/ Lac) showing constitutive labelling due to the lack of degradation.]


Extended Data Figure 3 | Putative cellular mechanisms of the specific
concentration of AS-PaRac1 in potentiated spines. a, Uniform labelling of
spines with the PSD-PaRac1 construct that lacks the DTE of the Arc 3′ UTR (Fig. 1a,
construct B). PSD-PaRac1 is translated in the soma, which is abundantly
equipped with translational machinery. Therefore, the somatic protein
expression of the probe is high (data not shown) and would outpace
degradation, and the resulting proteins are transported throughout dendrites.
The overflowing probes integrate into the postsynaptic density (PSD) during
the constitutive turnover of PSD molecules. Therefore, probe expression is
proportional to the spine size. b, Selective labelling of potentiated spines with
AS-PaRac1 (Fig. 1a, construct C). The following six mechanisms endow
AS-PaRac1 with potentiation-specific labelling. (1) A little somatic translation:
the gene expression of AS-PaRac1 is moderate, so the translation of AS-PaRac1
protein is limited in the soma (see Extended Data Fig. 2b), and
therefore, the non-specific overflow of this probe from the soma into the
dendrites is minimal. (2) Dendritic targeting element (DTE): the essential
domains of AS-PaRac1 are the N-terminal PSD-95 (PSDΔ1.2) and the 3′ UTR
of Arc mRNA (DTE). The DTE has a pivotal role in the dendritic targeting of
mRNAs***°. One of the best-known DTEs is present in the Arc mRNA",
which is targeted to stimulated dendritic segments in an activity-dependent
manner’®. The transport of mRNA out of the soma also contributes to the limited
translation of the probe in the soma described in (1). In the absence of
activation, the limited amount of translational machinery and the presence of
degradation components in the dendrites maintain the locally translated
probe at a low level, which results in a low rate of AS-PaRac1 integration
into the PSD during the constitutive turnover of PSD proteins. (3) Local
protein synthesis: persistent structural plasticity of the spine depends on the
activity-dependent dendritic synthesis of proteins”, and the translation of
Arc mRNA is controlled by activity levels'’. (4) Effective capturing of PSD
proteins in the structurally potentiated spines: the potentiated spine, which
rapidly requires new copies of PSD proteins, captures diffusing PSD proteins
more efficiently’””*. (5) Increased stability of AS-PaRac1 in the PSD: it is
likely that the stability of the PSD-integrated AS-PaRac1 increases, as does that
of typical PSD scaffold proteins’’. Ubiquitination might be the underlying
mechanism of the increased stability, because the ubiquitination site of
AS-PaRac1 resides in the N-terminal domain of PSD-95, which
aggregates to form head-to-head multimers in the postsynaptic
scaffold”. Thus, once AS-PaRac1 is integrated into the PSD, the ubiquitination
site may be concealed, and AS-PaRac1 becomes relatively stable. (6) Sensitivity
of unbound AS-PaRac1 to proteasomal degradation: in contrast to
the PSD-integrated AS-PaRac1, unbound AS-PaRac1 is sensitive to
degradation because the ubiquitination site is not concealed. This scenario is
supported by the administration of lactacystin (right panel), which inhibits
proteasomes and thus completely disrupts the uneven distribution of
AS-PaRac1. Similar mechanisms are relevant for newly formed spines, because
spine formation is associated with spine enlargement”.


[Extended Data Figure 4, panels a–d.]

a, Classification of spines. Enlarged (ΔV ≥ 50%): AS (+) [AS ≥ 1 at 0 day] or AS (−) [AS < 1 at 0 day]. Other: no change (ΔV < ±50%), shrunk (ΔV ≤ −50%) or eliminated (V = background level). Axes: DsRed [spine size, V (a.u.)] against AS at 0 day and +2 day.

b, Legend: soma with AS expression; AS ‘before’ learning; new AS during ‘learning’; new AS ‘after’ learning-1; new AS ‘after’ learning-2.

c, Number of AS puncta and fraction (%) at −1 day, 0 day, +1 day and +2 day.

d, Definitions:
i: # of filler-labelled soma
ii: # of filler-labelled spines
iii: # of AS (+) soma at −1 day
iv: # of AS (+) soma at 0 day
v: # of AS (+) spines at −1 day
vi: # of AS (+) spines at 0 day
vii: Spontaneously activated neurons per field (%) = (iii) / (i) × 100
viii: Learning-evoked neurons per field (%) = {(iv) − (iii)} / (i) × 100
ix: Learning-evoked AS (+) spines per field (%) = {(vi) − (v)} / (ii) × 100
x: Learning-evoked AS (+) spines per learning-evoked neuron = (ix) / (viii) × 100

Example estimation from b and c: i = 51, ii = 16,037, iii = 3, iv = 9, v = 346, vi = 720; vii = 5.88%, viii = 11.8%, ix = 2.33%, x = 19.8%.

                      viii: Learning   ix: Learning   x: Learning spines /    # of mice
                      neuron (%)       spines (%)     learning neurons (%)
Layer II/III neuron   16.4 ± 2.8       2.3 ± 0.13     14.71 ± 2.01            5
Layer V neuron        22.57 ± 2.8      1.15 ± 0.27    5.01 ± 0.76             4


Extended Data Figure 4 | Raw data of quantification and synaptic mapping.
Data from Fig. 2. a, Spine size (based on DsRed fluorescence) and AS-PaRac1
fluorescence after learning are plotted separately according to the
classification of spines. The definitions of ‘New spine’, ‘Enlarged spine’ and
others are described on the right. Each arrow indicates the trajectory of a spine;
beginning and end points represent the absolute values before and after the
rotarod task, respectively. b, xy images were captured from the dura to a
depth of 300 µm with a step-size of 1.0 µm, and were stacked by the summation
of fluorescence values at each pixel. z-stacked images of 10 overlapping fields
were aligned to generate the combined images. AS-PaRac1 and AS-PaRac1/DsRed
merged images are shown. AS-PaRac1 puncta that were present before learning
(−1 day, yellow), or that appeared shortly after learning (learning period, 0 day,
green), 1 day after learning (after-1, +1 day, blue), or 2 days after learning
(after-2, +2 day, purple) are depicted to show the spatiotemporal distribution
of AS-PaRac1 triggered in each period. c, Time course of the number and fraction of
AS-PaRac1-positive spines in each period. d, Calculation of the learning-evoked
spine/neuron ratio (%). The example calculation is based on the
raw data shown in b and c. The table indicates the comparison between neurons
in layer II/III (in utero electroporation at E14.5) and layer V (in utero
electroporation at E13).
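
The ratio calculation in panel d can be checked with a short Python sketch. The function name and argument names are hypothetical, and the counts are the example values printed in the panel (51 filler-labelled somata, 16,037 filler-labelled spines, 3 and 9 AS-positive somata, 346 and 720 AS-positive spines at −1 day and 0 day, respectively). This is a sketch of the arithmetic as defined in the panel, not the authors' analysis code.

```python
def learning_evoked_ratios(n_soma, n_spines,
                           as_soma_pre, as_soma_post,
                           as_spines_pre, as_spines_post):
    """Compute quantities vii to x of panel d from the raw counts i to vi."""
    # vii: spontaneously activated neurons per field (%)
    vii = as_soma_pre / n_soma * 100
    # viii: learning-evoked neurons per field (%)
    viii = (as_soma_post - as_soma_pre) / n_soma * 100
    # ix: learning-evoked AS (+) spines per field (%)
    ix = (as_spines_post - as_spines_pre) / n_spines * 100
    # x: learning-evoked AS (+) spines per learning-evoked neuron
    x = ix / viii * 100
    return {"vii": vii, "viii": viii, "ix": ix, "x": x}


if __name__ == "__main__":
    ratios = learning_evoked_ratios(51, 16037, 3, 9, 346, 720)
    for name, value in ratios.items():
        print(name, round(value, 2))
```

With the example counts, the sketch reproduces the values given in the panel after rounding (5.88%, 11.8%, 2.33% and 19.8%).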


[Extended Data Figure 5, panels a–c. a, Schematic of a spine synapse and a shaft synapse with the expected changes in filler and AS fluorescence. b, Filler/AS images with ROIs for the shaft punctum (i), the spine (ii) and other shaft puncta, before and after the emergence of AS-PaRac1. c, Plot against AS value in the ROI (a.u.), with the background level indicated.]


Extended Data Figure 5 | Assessment of AS-PaRac1 puncta on the dendritic
shaft. a, The two possible synapse types that AS-PaRac1 puncta may represent
on the dendritic shaft. xy images were captured to encompass the entire
z-range of the dendrite of interest with a step-size of 0.5 µm, and images were
stacked by the summation of fluorescence values at each pixel. The fluorescence
of both the filler and AS-PaRac1 would increase, if the AS-PaRac1 punctum
emerged on the dendritic spine that undergoes structural potentiation. In
contrast, fluorescence of the filler would not increase, if AS-PaRac1 was in the
shaft synapse. b, Example of the dendrites before and after the emergence of
AS-PaRac1. AS-PaRac1 puncta on the shaft and on the dendritic spine are
indicated with (i) and (ii), respectively. The region of interest (ROI) used for
the calculation of fluorescence in each punctum is shown. c, Quantification
of the fluorescence of the filler and AS-PaRac1 upon the emergence of
AS-PaRac1 puncta. Each arrow indicates the trajectory of each ROI; beginning
and end points represent the absolute values before and after the emergence
of AS-PaRac1, respectively. The ROI at (i) exhibited a concomitant
fluorescence increase in both the filler and AS-PaRac1, similar to AS-PaRac1
in a typical dendritic spine (ii). All examined AS-PaRac1 puncta on the
dendritic shaft exhibited positive correlations, suggesting that the majority
of AS-PaRac1 puncta emerge on the dendritic spine during the structural
changes of the spine.


[Extended Data Figure 6, panels a–d. a, PA protocol (458-nm laser). b, Schematics of the CAG::AS-PaRac1, SARE::AS-PaRac1 and SARE::PSDΔ1.2-LOV-DTE constructs, with dendritic images before and after PA. c, Time courses (−20 to 60 min) of ΔV (%) for spines with and without Venus. d, Bar graphs for spines with and without Venus.]

Extended Data Figure 6 | Rac1-dependent shrinkage of dendritic spines
induced by low-frequency photoactivation. a, The protocol of
photoactivation. Photoactivation was performed in the region that
encompasses the branch of interest. b, Neurons in the hippocampal slice culture
(DIV 11) were biolistically transfected with the DNA constructs shown in the
schematic image on the left. Representative dendritic images upon
photoactivation are shown on the right. Robust shrinkage (arrowheads) was
observed in the spines transfected with AS-PaRac1 driven by the SARE-Arc
promoter. Despite their adjacent location to the AS-PaRac1-positive spines,
AS-PaRac1-negative spines were not affected by the photoactivation. c, Time
course of the spine head volume (V) of Venus-positive (upper panel) and
Venus-negative spines (lower panel). White, red, and blue circles represent
CAG::AS-PaRac1, SARE::AS-PaRac1, and SARE::PSDΔ1.2-LOV-DTE, respectively
(n = 12 cells each). d, The mean relative change in spine head size in
Venus-positive and Venus-negative spines 60 min after photoactivation. Scale bars, 2 µm.
*P < 0.05 and ***P < 0.001 according to the Kruskal-Wallis test followed
by the post-hoc Scheffé’s test.


[Extended Data Figure 7, panels a–f. a, Schematic of the cranial window (centre at AP −1.1, ML +1.0; glass window, thickness < 0.1 mm) with outer cylinder, cement, headgear and optical fibre (core diameter = 500 µm, N.A. = 0.5). b, Spine images before and after PA with quantification. c–e, Bar graphs for the re-training and home cage groups and for AS (+) and AS (−) spines. f, mRFP/Emx1/TO-PRO3 staining in layer II/III and layer V.]

Extended Data Figure 7 | Spine shrinkage in broad areas of the bilateral
motor cortices induced by blue laser illumination. a, Schematic of the
bilateral cranial windows, optical fibres, and the photoactivation protocol.
b, Representative images of spine shrinkage in the M1 cortex upon
photoactivation in vivo. AS-PaRac1-positive spines (green arrowheads) shrank,
while the AS-PaRac1-negative ones (white arrowheads) did not. Quantification
of spine size is shown on the right. c, The mean number of AS-PaRac1
puncta per field was calculated in the mice shown in Fig. 4i. d, e, Spine
structure and AS-PaRac1 were imaged in mice that were subjected to the
re-training and home cage protocols shown in Fig. 5. The majority of
AS-PaRac1-positive spines displayed photoactivation-induced shrinkage and
subsequent recovery. *P < 0.05 according to the Mann-Whitney U test.
f, The success of AAV5 vector injection into the bilateral M1 cortex was
confirmed by the presence of mRFP fluorescence after behavioural tests. High
efficacy of virus infection in layer II/III and V pyramidal neurons was
demonstrated with Emx1 immunostaining, which labels pyramidal neurons.
The mice without bilateral mRFP signal in the M1 cortex were excluded from
the data analysis.


[Extended Data Figure 8, panels a–c. a, Schedule for protocol no. 1: AAV injection into the M1 cortex at P28, cranial surgery at P60~100, then rotarod training (1 session), rotarod tests and locomotion tests before and after PA. b, Locomotion traces and running speed against time (0 to 300 s) before and after PA in mice expressing CK::mRFP and Arc::AS-PaRac1. c, Comparison of running speed before and after PA (P = 0.1).]


Extended Data Figure 8 | No effect of photoactivation on the locomotor
activity of mice. a, Experimental schedule. The running speed of
AS-PaRac1-injected mice in protocol no. 1 (Fig. 4a) was measured with a video-tracking
system. To minimize the effect of circadian rhythm on locomotion, mice were
tested at the same time of the day before and after photoactivation.
b, Representative traces of locomotion and temporal sequences of running
speed are depicted. c, Statistical analysis shows that photoactivation has only
a negligible effect on running speed.


[Extended Data Figure 9, panels a–c. a, Flowchart: test (pre-training), training, test (post-training) and test (after PA), each test scored as the average of 3 trials for skill judgement; ‘failure’ individuals with Δperformance < +20% relative to pre-training are excluded. b, Schedule for protocol no. 1: AAV5 injection into the M1 cortex at P28 and cranial surgery at P60~100, followed by rotarod tests, rotarod training and locomotion tests. c, A typical untrained mouse facing backward and a typical well-trained mouse facing forward on the rotating rod; air puffs are given as aversive stimuli only during training, not during tests.]


Extended Data Figure 9 | Detailed illustration of the rotarod and beam
tasks. Experimental setup for Fig. 4. a, Experimental flowchart. b, Detailed
schedule of the rotarod training/test, locomotion test, and photoactivation.
c, To shorten the training time, air puffs were applied to the hind limbs as
aversive stimuli to maintain the forward-looking position of mice on the rod,
which improved the performance, especially at higher speeds. d, Schematic
illustration of the beam test. The test was preceded by 6 h of training
spread over 2 days.

[Panel labels, continued. b, Rotarod training (1 session = 2 h), rotarod test (3 trials), locomotion test and photoactivation (150 ms, 1 Hz × 3,600 = 1 h), then perfusion fixation and examination of the AAV-infected area; individuals without bilateral M1 injection are excluded before data analysis. d, Beam (80 cm, flat surface of 8 mm, pliable) leading to a black box filled with nesting materials from home cages; air puffs when mice stop, only during training.]




Extended Data Table 1 | Detailed information on sample descriptions and statistics 




Sample Statistics
Description Size (n) Methods Comparison P values Correlation coefficient 
Construct (A) = 13 dendrites/13 slices/3 rats Enrichment} Hot spot 
Construct (B) = 20 dendrites/20 slices/6 rats A ‘ (A) vs (B) 0.03896 | 0.48912 
‘ = = : 7 One-way factorial ~ 
Figure la Construct (C) = 23 dendrites/23 slices/6 rats ANOVA (A) vs (C) 0.00001 0.00000. 
Construct (D) = 8 dendrites/8 slices/3 rats reherer er (A) vs (D) 0.69713 | 0.81024 
Hippocampal {Construct (E) = 8 dendrites/8 slices/3 rats (p (A) vs (E) 0.00130 | 0.00929 
slice culture mRFP Venus 
Fi Ib-d + uncaging alone (A) = 15 dendrites/15 slices/6 rats Kruskal Wallis test (A) vs (B) 0.62258 | 0.01139 
‘eure "| Gene Gun _funcaging + FSK (B) = 35 dendrites/35 slices/8 rats (post-hoc Scheffe's (A) vs (C) 0.00430] 0.10843 
uncaging + Aniso (C) = 20 dendrites/15 slices/6 rats test) (B) vs (C) 0.00000 | 0.00000 
Enrichment} Hot spot 
j in ee = 13 jles/13 slic ann Whi 
Figure le Vehicle = 1 dendrites/t pslices/2 rats Mann: ‘Whitney test Veh vs Lac 0.00134! 0.00030 
Lactacystin = 8 dendrites/8 slices/3 rats (two-sided) 
Enlarged New Shrunk {Eliminated 
Fioure 2d ining = 2793 spines/7 mic ann Whi any 
igure (Training 79: Spines/7 mice Mann. ‘Whitney test (Training) vs 0.03887} 0.02014] 0.12134] 0.12134 
2 (No training = 718 spines/3 mice (two-sided) (No training) 
In vivo MI AV (Oday) & 
Figure 2f cortex ay) 0.00000 0.61278 
sae A ‘ Spearman's rank AS (0 day) 
a Training = 2090 spines/3 mice eotietation Goeth AV (0 day) & 
Figure 2g | In utero EP O day, 0.51746 -0.03549 
‘ AS (-1 day) 
(E14.5) 
Enlarged New 
Fi 2k 68 spines larged ines) out of 2090 total [Mann-Whitney test AS (+1 day)21 
igure 6 spines (en TEC or new spines) out o1 ota ann hitney test [AS (+1 day)21] 0.00263! 0.00016 
spines for Fig 2e—j (two-sided) vs [AS (+1 day)<1] 
2 In vivo M1 Mann-Whitney test [AS (4)] vs 
Figure 3c ss 0.00000 
cortex : ; (two-sided) [AS (-)] 
: 94 spines/6 mice 5 se Tank 
' 4 pearman's rai ie 
3 .2815 -0. 
Figure 3d In utero EP correlation coefficient AVE Depth O28} ols 
AS (4) AS(-) AS (+)_|_AS(-) 
i 3i,j i a é s " % . iy i 5 23: 
Figure 33, j Hippocampal 24 spines/12 slices/6 mice Spearman's Tank : AV & AAmp 0.01793 0.57016 0.58235 | 0.23810 
slice culture correlation coefficient AV & AFreq 0.58679 | 0.95537 -0.14706 | 0.02381 
+ Amplitude _| Frequency 
i 3i,j Gene Gun ile i ‘a S 
Figure 3i, j i 6 notialiOah Wilcoxon signed rank (Before PA) vs 0.24886 | 0.09620 
test (After PA) 
mRFP AS AS AS 
(Prot #1) | (Prot #1) | (Prot #2) | (Prot #3) 
(Pre-training) vs | 4 49990} 0.00000} 0.00000 | 0.00034 
(0 day) 
(0 day) vs NA NA] 0.83880 NA 
(+1 day) 
aes ea pee . NA NA Na|_ 1.00000 
Figure 4b-d ne ANOVA 7 
paca . (0 day) vs 0.59421] 0.00056] 0.00003} 1.00000 
(post-hoc Bonferroni (After PA) 
test) (+1 day) vs 
(Afer PA) NA NA} 0.0000€ NA 
#1) = i +2 da ‘ 
mRFP alone (Protocol #1) = 10 mice ; (+2 day) vs NA NA Nal 1.00000 
mRFP + AS-PaRacl (Protocol #1) = 15 mice (After PA) 
mRFP + AS-PaRac1 (Protocol #2) 2'mice (Pre-training) vs 0.00005 0.00070! 0.01852} 0.00012 
ImRFP + AS-PaRacl (Protocol #3) = 5 mice (After PA) 
(Prot #1) vs 0.22430 
. One-way factorial (Prot #2) shies 
Inyo MI ANOVA (Prot #2) vs 
Fi 4 cortex es 0.00047 
ae? ta (post-hoc Scheffe's (Prot #3) 
Bilateral AAVS ee Ce 0.00989 
ae (Prot #3) 
infection 
mRFP AS AS AS mRFP AS AS AS 
Figure 4f-h (Prot #1) | (Prot #1) | (Prot #2) | (Prot #3) | (Prot #1) | (Prot #1) | (Prot #2) | (Prot #3) 
pacaumen Tan, po cpecand 0.74518] 0.04664] 0.00156] 0.95715] 0.1051} -0.5206} -0.8056] 0.0286 
correlation coefficient _| learning attainment 
Non-PA | Non-PA PA PA 
(RotaRod) | (Beam) |(RotaRod)| (Beam) 
(Pre-training) vs 
= 0.00000 0.00000 | 0.00000 | 0.00000 
(0 day) 
(aay) ve 1.00000 NA] 1.00000 NA 
Figure 4j One-way repeated (+2 day) 
measures ANOVA (0 day) vs 
e 1.00000 1.00000 | 1.00000) 0.04755 
Non-PA group = 13 mice (post-hoc Bonferroni (After PA) 
PA =13 mi test +2 di i 
eae est) G2 ey) ve 0.66982 NA 1.00000 NA 
(After PA) 
(Pre-training) vs | 4 49990} 0.00000} 0.00000 | 0.00000 
(After PA) 
Figure 4k Spearman's rank ; PA effect on each 0.06148 0.53168 
correlation coef task 
Both Day 4 
Day2/4 specific 
(Dual) vs 
Fi 5d. oO vay factoral (Re-training) 0.09206 0.21206 
a. ae Dual task group = 1713 spines/5 mice Neva pel menses — 
bm Re-training group = 765 spines/S mice ss (Berane) 0.00708] 0.57905 
é . (post-hoc Scheffe's (Homecage) 
Home cage group = 861 spines/S mice 
test) (Dual) vs ¥ 
0.35558 0.03786 
(Homecage) 
Dual task |Re-training] Homecag 
: al task group: 
In vivo M1 Dual task coup: / (Day2) vs 
Day2 specific spines = 48, Both Day2/4 spines =13, 0.00331 0.04339 | 0.00147 
cortex ‘ : (Both Day2/4) 
Day4 specific spines = 26 P P 
Figure 5e, h + Re-Gainnl = One-way factorial 
| Inutero EP |SS Haine Brou: : ANOVA 
k x Day2 specific spines = 10, Both Day2/4 spines =15, is (Day2) vs (Day4) 0.00077 0.04579 | 0.01531 
(E14.5) apes, (post-hoc Scheffe's 
Day4 specific spines = 9 test) 
Home cage group: ‘i 
eine Both Day2/4) vs 
Day2 specific spines = 10, Both Day2/4 spines =6, ea Baa DS | o.o9817] 0.95804} 0.65233 
Day4 specific spines = 9 (ay! 
(Dual) vs 
ts 0.00000 
: (Re-training) 
Dual task group = 5 mice Chi-squared test (Post- Re-training) 
Figure 5n Re-training group = 5 mice hoc Bonferonni 7 ieee 1.00000 
Home cage group = 5 mice correction) : Dull — 
ene 0.00000 
(Homecage) 


ARTICLE 


doi:10.1038/nature14877 


Panorama of ancient metazoan 
macromolecular complexes 


Cuihong Wan!?*, Blake Borgeson”*, Sadhna Phanse’, Fan Tu”, Kevin Drew’, Greg Clark°, Xuejian Xiong*®, Olga Kagan’, 
Julian Kwan, Alexandr Bezginov’, Kyle Chessman*°, Swati Pal®, Graham Cromar*, Ophelia Papoulas”, Zuyao Ni, 
Daniel R. Boutz’, Snejana Stoilova', Pierre C. Havugimana', Xinghua Guo', Ramy H. Malty®, Mihail Sarov’, 

Jack Greenblatt*, Mohan Babu®, W. Brent Derry*”, Elisabeth R. Tillier®, John B. Wallingford”, John Parkinson*”, 


Edward M. Marcotte”® & Andrew Emili* 


Macromolecular complexes are essential to conserved biological processes, but their prevalence across animals is 
unclear. By combining extensive biochemical fractionation with quantitative mass spectrometry, here we directly 
examined the composition of soluble multiprotein complexes among diverse metazoan models. Using an integrative 
approach, we generated a draft conservation map consisting of more than one million putative high-confidence 
co-complex interactions for species with fully sequenced genomes that encompasses functional modules present 
broadly across all extant animals. Clustering reveals a spectrum of conservation, ranging from ancient eukaryotic 
assemblies that have probably served cellular housekeeping roles for at least one billion years, ancestral complexes 
that have accrued contemporary components, and rarer metazoan innovations linked to multicellularity. We 
validated these projections by independent co-fractionation experiments in evolutionarily distant species, affinity 
purification and functional analyses. The comprehensiveness, centrality and modularity of these reconstructed 
interactomes reflect their fundamental mechanistic importance and adaptive value to animal cell systems. 


Introduction 


Elucidating the components, conservation and functions of multi- 
protein complexes is essential to understand cellular processes’”, 
but mapping physical association networks on a proteome-wide scale 
is challenging. The development of high-throughput methods for 
systematically determining protein-protein interactions (PPIs) has 
led to global molecular interaction maps for model organisms 
including E. coli, yeast, worm, fly and human*"°. In turn, comparative 
analyses have shown that PPI networks tend to be conserved’’”’, 
evolve more slowly than regulatory networks’’, and closely mirror 
function retention across orthologous groups'''*"*. Yet fundamental 
questions arise’®'’. Here we define: (i) the extent to which physical 
interactions are preserved between phyla; (ii) the identity of protein 
complexes that are evolutionarily stable across animals; and (iii) the 
unique attributes of macromolecule composition, phylogenetic 
distribution and phenotypic significance. 


Generating a high-quality conserved interaction dataset 


As previous cross-species interactome comparisons, based on experi- 
mental data from different sources and methods, show limited over- 
lap’*"*, we sought to produce a more comprehensive and accurate 
map of protein complexes common to metazoa by applying a stan- 
dardized approach to multiple species. We employed biochemical 
fractionation of native macromolecular assemblies followed by tan- 
dem mass spectrometry to elucidate protein complex membership 
(Fig. 1; see Supplementary Methods). Previous application of this 
co-fractionation strategy to human cell lines preferentially identi- 
fied vertebrate-specific protein complexes®, so we selected eight 
additional species for study on the basis of their relevance as model 


organisms, spanning roughly a billion years of evolutionary diver- 
gence (Fig. 1a). The resulting co-fractionation data (Fig. 1b) acquired 
for Caenorhabditis elegans (worm), Drosophila melanogaster (fly), 
Mus musculus (mouse), Strongylocentrotus purpuratus (sea urchin), 
and human were used to discover conserved interactions (Fig. 1c), 
while the data obtained for Xenopus laevis (frog), Nematostella 
vectensis (sea anemone), Dictyostelium discoideum (amoeba) and 
Saccharomyces cerevisiae (yeast) were used for independent valid- 
ation. Details on the cell types, developmental stages and fractionation 
procedures used are provided in Supplementary Table 1. 

We identified and quantified (see Supplementary Methods) 13,386 
protein orthologues across 6,387 fractions obtained from 69 different 
experiments (Fig. 2a), an order of magnitude expansion in data 
coverage relative to our original (H. sapiens only) study®. Individual 
pair-wise protein associations were scored based on the fractionation 
profile similarity measured in each species. Next, we used an integ- 
rative computational scoring procedure (Fig. 1c; see Supplementary 
Methods) to derive conserved interactions for human proteins and 
their orthologues in worm, fly, mouse and sea urchin, defined as 
high pair-wise protein co-fractionation in at least two of the five 
input species. The support vector machine learning classifier used 
was trained (using fivefold cross-validation) on correlation scores 
obtained for conserved reference annotated protein complexes 
(see Supplementary Methods), and combined all of the input species 
co-fractionation data together with previously published human*” 
and fly interactions’ and additional supporting functional association 
evidence (HumanNet). Measurements of overall performance 
showed high precision with reasonable recall by the co-fractionation 
data alone (Fig. 2b), with external data sets serving only to increase 


1Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, Ontario M5S 3E1, Canada. 2Center for Systems and Synthetic Biology, Institute for Cellular and Molecular Biology,
University of Texas at Austin, Austin, Texas 78712, USA. 3Department of Medical Biophysics, Toronto, Ontario M5G 1L7, Canada. 4Department of Molecular Genetics, University of Toronto, Toronto, Ontario
M5S 1A8, Canada. 5Hospital for Sick Children, Toronto, Ontario M5G 1X8, Canada. 6Department of Biochemistry, University of Regina, Regina, Saskatchewan S4S 0A2, Canada. 7Max Planck Institute of
Molecular Cell Biology and Genetics, 01307 Dresden, Germany. 8Department of Molecular Biosciences, University of Texas at Austin, Austin, Texas 78712, USA.


*These authors contributed equally to this work. 


17 SEPTEMBER 2015 | VOL 525 | NATURE | 339 


©2015 Macmillan Publishers Limited. All rights reserved 


[Figure 1 graphic. a, Phylogenetic tree of the species analysed (Homo sapiens, Mus musculus, Xenopus laevis, Strongylocentrotus purpuratus, Drosophila melanogaster, Caenorhabditis elegans, Nematostella vectensis, Saccharomyces cerevisiae, Dictyostelium discoideum) with estimated divergence times (Mya). b, Samples and fractions (6,387 fractions). c, Workflow: standardize data by mapping to human; (1) calculate correlations; (2) machine learning; (3) clustering of high-confidence interactions into conserved complexes.]

Figure 1 | Workflow. a, Phylogenetic relationships of organisms analysed in 
this study. We fractionated soluble protein complexes from worm (C. elegans) 
larvae, fly (D. melanogaster) S2 cells, mouse (M. musculus) embryonic stem 
cells, sea urchin (S. purpuratus) eggs and human (HEK293/HeLa) cell lines. 
Holdout species (‘T’, for test) likewise analysed were frog (X. laevis), an
amphibian; sea anemone (N. vectensis), a cnidarian with primitive eumetazoan
tissue organization; slime mould (D. discoideum), an amoeba; and yeast
(S. cerevisiae), a unicellular eukaryote. b, Protein fractions were digested and


precision and recall as we required all derived interactions to 
have extensive biochemical support (see Supplementary Methods). 
Co-fractionation data of each input species affected overall perform- 
ance, in each case increasing precision and recall (Extended Data 
Fig. 1a). The final filtered interaction network consists of 16,655
high-confidence co-complex interactions in human (Supplementary 
Table 2). All of the interactions were supported by direct biochemical 
evidence in at least two input species, with half (8,121) detected in 
three or more (Extended Data Fig. 1b), enabling cross-species mod- 
elling and functional inference. 
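
The scoring idea described above (profile similarity per species, with conservation requiring support in at least two input species) can be illustrated with a minimal sketch. It is not the authors' pipeline, which integrates correlation scores with a support vector machine classifier; here a fixed Pearson threshold (`threshold=0.8`, a hypothetical value) stands in for the learned classifier, and the toy elution profiles are invented for illustration.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two equal-length elution profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a flat profile carries no co-elution signal
    return sxy / sqrt(sxx * syy)

def conserved(profiles_by_species, prot_a, prot_b, threshold=0.8, min_species=2):
    """Return (is_conserved, supporting_species) for one protein pair.

    threshold is a hypothetical cut-off standing in for a trained classifier;
    min_species=2 mirrors the 'at least two input species' rule in the text.
    """
    supporting = []
    for species, profiles in profiles_by_species.items():
        if prot_a in profiles and prot_b in profiles:
            if pearson(profiles[prot_a], profiles[prot_b]) >= threshold:
                supporting.append(species)
    return len(supporting) >= min_species, supporting

# Invented spectral-count profiles across five fractions (illustration only).
toy = {
    "human": {"A": [0, 5, 9, 4, 0], "B": [0, 4, 10, 5, 1], "C": [7, 1, 0, 0, 3]},
    "fly":   {"A": [1, 8, 3, 0, 0], "B": [0, 9, 4, 1, 0], "C": [0, 0, 2, 8, 1]},
    "worm":  {"A": [0, 0, 6, 2, 0], "C": [5, 5, 0, 0, 0]},
}

pair_ab = conserved(toy, "A", "B")
pair_ac = conserved(toy, "A", "C")
```

In this toy data set, proteins A and B co-elute in human and fly and so pass the two-species rule, whereas A and C do not co-elute in any species.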


Benchmarking protein complexes 


Multiple lines of evidence support the quality of the network: ref- 
erence complexes withheld during training were reconstructed with 
higher precision and recall (Fig. 2b; see Extended Data Fig. 1c) relative 
to our human-only map’. The interacting proteins were also sixfold 
enriched (hypergeometric P<1X10~~*) for shared subcellular 
localization annotations in the Human Protein Atlas Database”', 
21-fold enriched (P< 1 10~°°) for shared disease associations in 
OMIM”, and showed highly correlated human tissue proteome 
abundance profiles” (Extended Data Fig. 2a). 
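
As an illustration of the kind of hypergeometric enrichment test cited above, the following sketch computes a fold enrichment and an upper-tail P value. The counts in the example are invented and are not taken from the study; the function names are likewise hypothetical.

```python
from math import comb

def hypergeom_upper_tail(k, N, K, n):
    """P(X >= k) when n items are drawn without replacement from N,
    of which K carry the annotation of interest."""
    upper = min(K, n)
    total = comb(N, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / total

def fold_enrichment(k, N, K, n):
    """Observed annotated fraction divided by the background fraction."""
    return (k / n) / (K / N)

# Invented example: 1,000 candidate pairs, 100 sharing a localization
# annotation; 50 predicted interactions, 30 of which share the annotation.
fold = fold_enrichment(30, 1000, 100, 50)
p_value = hypergeom_upper_tail(30, 1000, 100, 50)
```

With these invented counts the predicted pairs are sixfold enriched for the shared annotation, and the tail probability is vanishingly small.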

To independently verify the reliability of these projections, we 
examined the co-fractionation profiles of putatively interacting 
orthologues (interologues) in the four holdout species, as obtained 
by protein quantification across 1,127 biochemical fractions (see 
Supplementary Methods). Whereas sequence divergence changed 
absolute chromatographic retention times (Extended Data Fig. 2b), 
most of the predicted interactors showed highly correlated 
co-fractionation profiles among the holdout test species to a degree 
comparable to those of the input species used for learning (Fig. 2c). 
The biochemical data obtained for frog and sea anemone showed 
slightly better agreement than that for Dictyostelium and yeast that 
was proportional to evolutionary distance”. 

Besides indicating stably associated proteins, our multispecies 
biochemical profiles faithfully recapitulated the architecture of 



analysed by high-performance liquid chromatography tandem mass 
spectrometry (LC-MS/MS), measuring peptide spectral counts and precursor 
ion intensities. c, Integrative computational analysis. After orthologue mapping 
to human, correlation scores of co-eluting protein pairs detected in each 
‘input’ species were subjected to machine learning together with additional 
external association evidence, using the CORUM complex database as a 
reference standard for training. High-confidence interactions were clustered 
to define co-complex membership. 


multiprotein complexes of known three-dimensional structure, with 
a general trend for most correlated protein pairs to be spatially closer 
(Extended Data Fig. 2c). For example, hierarchical clustering of 
26S proteasome subunits according to chromatographic elution
profiles of all five input species correctly separated the 20S and 
19S particles and the regulatory lid from the base sub-complex 
(Fig. 2d), reflecting known hierarchies of complex formation and 
disassembly. 
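
The clustering step described above can be sketched as average-linkage agglomeration on a distance of 1 minus the Pearson correlation of elution profiles. This is an assumption-laden illustration: the source does not state the linkage method used, and the subunit names and profiles below are invented rather than measured.

```python
from itertools import combinations
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two equal-length profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy) if sxx and syy else 0.0

def average_linkage(profiles, n_clusters):
    """Merge the two closest clusters (distance = 1 - r) until n_clusters remain."""
    clusters = [[name] for name in profiles]
    dist = {frozenset(pair): 1 - pearson(profiles[pair[0]], profiles[pair[1]])
            for pair in combinations(profiles, 2)}

    def cluster_distance(c1, c2):
        # Average of all pairwise distances between members of the two clusters.
        return sum(dist[frozenset((a, b))] for a in c1 for b in c2) / (len(c1) * len(c2))

    while len(clusters) > n_clusters:
        i, j = min(combinations(range(len(clusters)), 2),
                   key=lambda ij: cluster_distance(clusters[ij[0]], clusters[ij[1]]))
        clusters[i] = clusters[i] + clusters[j]  # i < j, so deleting j is safe
        del clusters[j]
    return [sorted(c) for c in clusters]

# Invented elution profiles for hypothetical core, base and lid subunits.
toy_profiles = {
    "core_1": [0, 1, 8, 2, 0, 0], "core_2": [0, 2, 9, 3, 0, 0],
    "base_1": [0, 0, 1, 7, 3, 0], "base_2": [0, 0, 2, 8, 2, 0],
    "lid_1":  [0, 0, 0, 2, 9, 1], "lid_2":  [0, 0, 0, 3, 8, 2],
}

three_groups = average_linkage(toy_profiles, 3)
two_groups = average_linkage(toy_profiles, 2)
```

In the toy data, cutting the tree at three clusters recovers the core, base and lid groups, and cutting at two joins base with lid before either joins the core, echoing the hierarchy described in the text.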


Landscape of interaction conservation across species 


Because most of the interacting components were phylogenetically 
conserved across vast evolutionary timescales, we were able to predict 
over one million high-confidence co-complex interactions among 
orthologous protein pairs for 122 extant eukaryotes with sequenced 
genomes (Supplementary Table 3). The number of interactions 
ranged from ~8,000 to ~15,000 per species depending on phyla 
(Fig. 2e), with more projected among Deuterostomes, Protostomes 
and Cnidaria, which show high component retention, and fewer in 
Fungi, Plants and, especially, Protists, where the relative paucity of 
co-complex conservation probably reflects inherent clade diversity, 
especially in parasite genomes (for example, gene loss among 
Apicomplexa). While largely congruent with previous smaller-scale 
studies of PPI conservation”, the majority of conserved 
co-complex interactions are novel (less than one-third curated in 
CORUM, STRING and GeneMANIA databases; Fig. 2e). This mark- 
edly increases the number of metazoan protein interactions reported 
to date (Supplementary Table 3), covering roughly 10%-25% of the 
estimated conserved animal cell interactome**”’, opening up many 
new avenues of inquiry. 
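
Projection of human co-complex interactions onto another species can be sketched as a lookup through an orthologue map: a pair is carried over only when both partners have an orthologue. The function name and the toy orthologue map are assumptions for illustration; the worm identifiers bub-3 and B0035.1 are those named in this article, while the omission of CCDC22 and COMMD2 from the map is purely hypothetical.

```python
def project_interactions(human_ppis, orthologues):
    """Map human protein pairs to orthologous pairs in a target species.

    human_ppis: iterable of (protein_a, protein_b) tuples.
    orthologues: dict mapping a human protein to a list of target-species proteins.
    Pairs lacking an orthologue for either partner are dropped.
    """
    projected = set()
    for a, b in human_ppis:
        for oa in orthologues.get(a, []):
            for ob in orthologues.get(b, []):
                if oa != ob:  # ignore self-pairs created by shared orthologues
                    projected.add(frozenset((oa, ob)))
    return projected

human_pairs = [("BUB3", "ZNF207"), ("CCDC22", "COMMD2")]
# Toy map: only the first pair has both partners represented.
toy_worm_map = {"BUB3": ["bub-3"], "ZNF207": ["B0035.1"]}

worm_pairs = project_interactions(human_pairs, toy_worm_map)
```

Only the BUB3-ZNF207 pair is projected here, because the toy map contains no orthologues for the second pair.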

To systematically define evolutionarily conserved functional mod- 
ules, we partitioned the interaction network using a two-stage cluster- 
ing procedure (Fig. 1c; see Supplementary Methods) that allowed 
proteins to participate in multiple complexes (that is, moonlighting) 
as merited (Extended Data Fig. 3a). The 981 putative multiprotein 
groupings (Fig. 3a; see Supplementary Table 4) include both 


[Figure 2 graphic. a, Scale of project (proteins detected). b, Precision versus recall (TP/(TP + FN)) for all data, fractionation only and external data only. c, Co-elution score distributions. d, Representative proteasome subunit profiles, correlation matrix and hierarchical clustering dendrogram. e, Predicted PPIs per species and clade, with overlap with GeneMANIA, STRING (high confidence) and CORUM PPIs; 1,289,821 predicted PPIs in total.]


Figure 2 | Derivation and projection of protein co-complex associations 
across taxa. a, Expanded coverage via experimental scale-up relative to our 
previous human study®. Chart shows number of proteins detected, most (63%) 
in two or more species. b, Performance benchmarks, measuring precision and 
recall of our method and data in identifying known co-complex interactions 
(annotated human complexes from CORUM”). Complexes were split into 
training and withheld test sets; fivefold cross-validation against 4,528 
interactions derived from the withheld test set shows strong performance gains, 
beyond baselines achieved using only co-fractionation or external evidence 
alone. TP, true positive; FP, false positive; FN, false negative. c, Plots showing 
high enrichment (probability ratio of interacting) of predicted interacting 
orthologous protein pairs (relative to non-interacting pairs) among highly 


many well-known and novel complexes linked to diverse biological 
processes (Extended Data Fig. 3b). The complexes have estimated 
component ages spanning from ~500 million (metazoan-specific, 
or ‘new’) to over one billion years (ancient, or ‘old’) of evolutionary 
divergence. Details of species, orthologues, taxonomic groups, protein 
ages and evolutionary distances are provided in Supplementary 
Tables 3 and 5 and Supplementary Methods. 

Although proteins arising in metazoa (by gene duplication or other 
means) account for about three quarters of all human gene products, 


correlated fractionation profiles, in both the holdout validation (test, T) and 
input species (colours reflect clade memberships). d, Left, representative 
co-fractionation data (normalized spectral counts shown for portions of 3 of 42 
experimental profiles) from human, fly and sea urchin showing characteristic 
profiles of proteasome core, base and lid sub-complexes. Hierarchical 
clustering (right) of pan-species pairwise Pearson correlation scores (centre) 
is consistent with accepted structural models (Protein Data Bank ID: 4CR2; 
core, red; base, blue; lid, green; out-clusters, white). e, Projection of conserved 
co-complex interactions across 122 eukaryotic species, indicating overlap with 
leading public PPI reference databases*”*'. STRING bars indicate excess 

over CORUM; GeneMANIA bars indicate excess over both; component and 
interaction occurrences across clades indicated at bottom. 


they form only a small fraction (15%; 147) of the clusters (Fig. 3a). These
‘new’ complexes tend to be smaller (≤3 components; Fig. 3b) and
specific (components not present in ‘mixed’ complexes). This indi- 
cates that although protein number and diversity greatly increased 
with the rise of animals”, most stable protein complexes were inher- 
ited from the unicellular ancestor and subsequently modified slightly 
over time (Fig. 3c and Supplementary Table 5). Indeed, the dominant 
phylogenetic profile of complexes across Eukarya (Fig. 3d) is com- 
posed either entirely (344 old complexes) or predominantly (490 


[Figure 3 graphic. a, Network of conserved complexes. b, Example complexes with old and new subunits. c, Presumed origins of metazoan (new), mixed and old complexes. d, Heat map of the fraction of components observed across clades (Deuterostomes, Protostomes, Cnidarians, Fungi).]


Figure 3 | Prevalence of conservation of protein complexes across Metazoa 
and beyond. a, Conserved multiprotein complexes, identified by clustering, 
arranged according to average estimated component age (see Supplementary 
Methods and ref. 25). Proteins (nodes) classified as metazoan (green) or ancient 
(orange); assemblies showing divergent phylogenetic trajectories termed 
‘mixed’. b, Example complexes with different proportions of old and new 
subunits. c, Presumed origins of metazoan (new), mixed and old complexes; ‘?’
indicates variable origins of new genes. d, Heat map showing prevalence of 
selected complexes across phyla. Colour reflects fraction of components with 
detectable orthologues (absence, dark blue). Sea anemone (N. vectensis) is the 
most distant metazoan (cnidarian) analysed biochemically. 


mixed complexes) of ancient subunits ubiquitous among eukaryotes 
(Extended Data Fig. 4a; see Supplementary Table 5 for details), the 
latter presumably reflecting preferential accretion of additional com- 
ponents to pre-existing macromolecules (Fig. 3c)”*. 
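
The three age classes used above can be expressed as a simple rule on subunit ages: entirely ancient subunits give an 'old' complex, entirely metazoan subunits give a 'new' one, and anything else is 'mixed'. The sketch below assumes each subunit has already been labelled 'old' or 'new' (the actual age assignment is described in the Supplementary Methods and is not reproduced here); the protein names are placeholders.

```python
def classify_complex(subunit_ages):
    """Classify a complex from a dict of subunit -> 'old' or 'new'."""
    ages = set(subunit_ages.values())
    if ages == {"old"}:
        return "old"
    if ages == {"new"}:
        return "new"
    return "mixed"

# Placeholder complexes with hypothetical age labels.
examples = {
    "complex_1": {"p1": "old", "p2": "old", "p3": "old"},
    "complex_2": {"p4": "new", "p5": "new"},
    "complex_3": {"p6": "old", "p7": "old", "p8": "new"},
}
labels = {name: classify_complex(ages) for name, ages in examples.items()}

# Cluster counts reported in the text for each class; they sum to the
# 981 putative multiprotein groupings.
reported_counts = {"new": 147, "old": 344, "mixed": 490}
total_clusters = sum(reported_counts.values())
```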

These primordial complexes are present throughout the 
Opisthokonta supergroup (animals and fungi), estimated to be more 
than one billion years old’’, and plants (and presumably lost/signifi- 
cantly diverged among parasitic protists). Reflecting this central 
importance, these complexes tend strongly to be ubiquitously 
expressed throughout all cell types and tissues (Extended Data Fig. 
5a), are abundant (Extended Data Fig. 5b), and are enriched for 
associations to human disease and perturbation phenotypes in C. 
elegans (Supplementary Table 6). In comparison with other proteins 
in the 16,655 interactions, the older, conserved proteins present in 
these stable complexes have lower average domain complexity
(P < 0.02; see Supplementary Methods), suggesting multi-domain
architectures underlie more transient or tissue-specific interactions. 
Whereas mixed and old complexes are enriched for functional asso- 
ciations with core cellular processes, such as metabolism (Extended 
Data Fig. 4c), the strictly metazoan complexes were far more likely to 
be linked to cell adhesion, organization and differentiation, consist- 
ent with roles in multicellularity. Reflecting these different evolu- 
tionary trajectories, new clusters are substantially more enriched for 
cancer-related proteins (42%; 62/147; hypergeometric P = 1 X 10 °) 
compared to strictly old (15%; 53/344; P=1*X 10 3) clusters 
(Z-test < 0.0001) (Supplementary Table 7), have generally lower 
annotation rates (Extended Data Fig. 4b), and show different pre- 
ponderances of protein domains (Extended Data Fig. 4c and 
Supplementary Table 6). 
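
The comparison of 62/147 new clusters with 53/344 strictly old clusters can be checked with a pooled two-proportion Z-test. The source does not specify which Z-test variant was applied, so the pooled, two-sided form below is an assumption.

```python
from math import erfc, sqrt

def two_proportion_z(x1, n1, x2, n2):
    """Pooled two-proportion Z-test; returns (z, two-sided P value)."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_two_sided = erfc(abs(z) / sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return z, p_two_sided

# Counts taken from the text: cancer-related clusters among new and old complexes.
z, p = two_proportion_z(62, 147, 53, 344)
```

Under these assumptions z is roughly 6.4, which is consistent with the reported significance below 0.0001.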


Independent biological assessment 


We used multiple approaches to assess the accuracy (Fig. 4) and 
functional significance (Fig. 5) of the predicted complexes. First, we 
performed affinity purification mass spectrometry (AP/MS) experi- 
ments on select novel complexes from the new, old and mixed age 
clusters, validating most associations in both worm and human 
(Fig. 4a and Extended Data Fig. 6a). We next performed a global 
validation by comparing our derived complexes to a newly reported 
large-scale AP/MS study of 23,756 putative human protein interac- 
tions detected in cell culture (E. L. Huttlin et al., BioGRID preprint 
166968), and observed a partial, but highly statistically significant, 
overlap to a degree comparable to literature-derived complexes 
(Fig. 4b, Extended Data Fig. 6b). 

We also observed broad agreement between the derived complexes’
inferred molecular weights (assuming 1:1 stoichiometries) and migra- 
tion by size-exclusion chromatography (Fig. 4c and Extended Data 
Fig. 7a) and density gradient centrifugation (Extended Data Fig. 7b). 
A prime example is the coherent profiles of a large (~500 kDa) 
mixed complex with several un-annotated components (Fig. 4d and 
Extended Data Fig. 8), dubbed ‘Commander’, because most 
subunits share COMM (copper metabolism MURR1) domains” impli- 
cated in copper toxicosis*’, among other roles*”*’. Commander con- 
tains coiled-coil domain proteins CCDC22 and CCDC93 (Figs 4a, d) 
in addition to ten COMM domain proteins, broadly supported 
by co-fractionation in human, fly and sea urchin (Extended Data 
Fig. 9a-c and supporting website, http://metazoa.med.utoronto.ca/ 
php/view_elution_image.php?id=71&cond=ms2). 
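
The inferred molecular weights mentioned above follow from summing subunit masses under an assumed 1:1 stoichiometry. A minimal sketch is given below; the subunit names, masses and the 50% tolerance are all hypothetical and are not values from the study.

```python
def inferred_mw_kda(subunit_masses_kda, copies=None):
    """Sum subunit masses; each subunit counts once unless copies says otherwise."""
    copies = copies or {}
    return sum(mass * copies.get(name, 1)
               for name, mass in subunit_masses_kda.items())

def consistent(inferred_kda, observed_kda, tolerance=0.5):
    """True if the observed mass is within a relative tolerance of the inferred mass."""
    return abs(observed_kda - inferred_kda) <= tolerance * inferred_kda

# Hypothetical four-subunit complex (masses in kDa, invented for illustration).
toy_complex = {"subunit_1": 70.0, "subunit_2": 75.0,
               "subunit_3": 25.0, "subunit_4": 30.0}

mw_one_to_one = inferred_mw_kda(toy_complex)
mw_with_dimer = inferred_mw_kda(toy_complex, copies={"subunit_3": 2})
```

A large gap between the inferred and observed mass would point to a stoichiometry other than 1:1 or to additional, undetected components.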

We found an unexpected role in embryonic development for 
Commander, whose subunits are strongly co-expressed in devel- 
oping frog (Extended Data Fig. 9d, e). COMMD2/3-knockdown 
(morpholino) tadpoles showed impaired head and eye development 
(Fig. 5a and Extended Data Fig. 9f, h), and defective neural pattern- 
ing and expression changes in brain markers PAX6, EN2 and 
KROX20/EGR1 (Fig. 5b and Extended Data Fig. 9g, h). Given the
recently discovered link**** between CCDC22 and human syn- 
dromes of intellectual disability, malformed cerebellum and craniofa- 
cial abnormalities, the deep conservation of the Commander complex 
suggests COMMD2/3 as strong candidates in the aetiology of these 
heterogeneous disorders. 

Among metazoan-specific protein complexes, we confirmed 
physical and functional associations of spindle checkpoint protein 
BUB3 with ZNF207, a zinc-finger protein conspicuously lacking 
orthologues in cnidarians and fungi. ZNF207 binds Bub3 via a 
Gle2-binding-sequence (GLEBS) motif restricted to deuterostomes 
and protostomes (Extended Data Fig. 10a). As in human, knockdown 
of the ZNF207 orthologue in C. elegans (B0035.1) enhanced lethality 
owing to impaired Bub3-mediated checkpoint arrest (Fig. 5c). 

Among mixed complexes, we confirmed metazoan-specific 
coiled-coil domain protein CCDC97 as a sub-stoichiometric com- 
ponent of human and worm SF3B spliceosomal complex involved in 
branch-site recognition (Fig. 4a). Consistent with a possible role in 


[Figure 4 graphic. a, Complex diagrams with human and worm AP/MS spectral counts for the indicated baits. b, Overlap, sensitivity and maximum matching ratio. c, d, Profiles across size-exclusion column fractions, including Commander subunits.]


Figure 4 | Physical validation of complexes. a, Verification of complexes 
from tagged human cell lines and transgenic worms (see Supplementary 
Methods; complexes drawn as in Fig. 3). Inset reports spectral counts obtained 
in replicate AP/MS analyses of indicated bait protein (header). MIB2-VPS4 
complex confirmed by co-immunoprecipitation (co-IP; Extended Data Fig. 6a). 
b, Conserved complexes significantly overlap large-scale AP/MS data reported 
for human cell lines (E. L. Huttlin et al., BioGRID preprint 166968) to a 


pre-mRNA splicing, CRISPR-based CCDC97-knockout human cells 
were slower growing than were control lines (Extended Data Fig. 
10b, c) and hypersensitive to pladienolide B (Fig. 5d), a macrolide 
inhibitor of SF3b**. 


[Figure 5 graphic. a, b, Control versus COMMD2 and COMMD3 MO(ATG) embryos. c, Lethality by RNAi target: HT115 (control), B0035.1 (ZNF207), bub-3 (BUB3), bub-3 + B0035.1. d, Response of CCDC97 CRISPR-1, CCDC97 CRISPR-6 and scramble CRISPR lines to PB (nM). e, Enrichment for Reactome signalling pathways and human central metabolism versus random shuffle. f, Typical versus channelling models. g, Purine biosynthesis enzymes PPAT, GART, PFAS, PAICS, ADSL and ATIC.]


comparable extent as literature reference sets*’’’, using three measures of 
complex-level agreement (see Supplementary Methods, Extended Data Fig. 
6b); ***P < 0.001, determined by shuffling (grey distributions). c, Agreement 
of inferred molecular weights (MW) of human protein complexes with size- 
exclusion chromatography profiles (data in c, d, from ref. 43). d, Co-elution of
human Commander complex subunits by size-exclusion chromatography 
consistent with an approximately 500-kDa particle. 


Network perspective into conserved biological systems 


Knowledge of conserved macromolecular associations provides a road map
for additional functional inferences. For instance, fractionation profiles
can be compared for any pair of proteins in our data set to search for
evidence of interactions. We found significant enrichment for interactions
among pairs of human proteins acting sequentially in annotated pathways³⁷
(Fig. 5e), especially G-protein and MAP-kinase cascades (Supplementary
Table 8). Enzymes acting consecutively in core metabolic reactions (Fig. 5f)
also showed a higher tendency to interact (Supplementary Table 8), the
significance of which decayed with more intervening steps (Fig. 5e).
For example, strong consecutive


Figure 5 | Functional validation of complexes. a, Morpholino (MO(ATG), 
targeting start codon to block translation) knockdown of COMMD2 (n = 55 
animals, 2 clutches, 1 eye each) or COMMD3 (n = 64) in X. laevis embryos 
causes defective head and eye development (control n = 57; Extended Data 
Fig. 9f, h). ***P < 0.0001, 2-sided Mann-Whitney test. b, COMMD2/3 
knockdown animals (five embryos per treatment examined) show altered 
neural patterning, including posterior shift or loss of expression of mid-brain 
marker EN2 and KROX20 (EGR1), the latter in rhombomeres R3/R5 
(compare to Extended Data Fig. 9g, h). c, Enhanced embryonic lethality 
(epistasis) following RNAi knockdown in C. elegans of B0035.1 (ZNF207) 
and bub-3 together (eggs laid: HT115, 1,308; B0035.1, 1,096; bub-3, 445;
bub-3 + B0035.1, 341). d, Enhanced sensitivity (mean ± s.d. across four cell
culture experiments) of two independent CCDC97-knockout lines to the SF3b
inhibitor pladienolide B (PB) relative to control HEK293 cells. e, Enrichment
(permutation test P value) for interactions among sequential pathway
components and metabolic enzymes relative to shuffled controls (n refers
to enzyme index, where n, n + 1 denotes sequential enzymes, n, n + 2
sequential-but-one, and so on, as described in Supplementary Information).
f, Metabolic channelling as opposed to traditional (typical) two-step cascade
model. g, Conserved interactions among consecutively acting enzymes 
involved in purine biosynthesis (two representative co-fractionation profiles of 
the 69 total generated are shown). 


17 SEPTEMBER 2015 | VOL 525 | NATURE | 343 


©2015 Macmillan Publishers Limited. All rights reserved 


ARTICLE 


interactions were apparent within the widely conserved purine biosynthetic
pathway, with enzymes (for example, PAICS, GART) eluting in two peaks
(Fig. 5g), one coincident with the prior enzyme and the second with the
downstream enzyme, suggestive of substrate channelling³⁸.
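
The pairwise comparison of fractionation profiles described in this section can be illustrated with a minimal Python sketch. It assumes that each protein's profile is a list of abundance values (for example, spectral counts) over the same ordered fractions, and it uses the Pearson correlation as one simple similarity measure; the scoring actually used in the study is described in its Supplementary Methods. The profile values and the names `upstream`, `middle` and `downstream` are hypothetical, chosen to mimic an enzyme that elutes in two peaks.

```python
import math


def pearson(profile_a, profile_b):
    """Pearson correlation between two equal-length elution profiles."""
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must cover the same fractions")
    n = len(profile_a)
    mean_a = sum(profile_a) / n
    mean_b = sum(profile_b) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(profile_a, profile_b))
    var_a = sum((a - mean_a) ** 2 for a in profile_a)
    var_b = sum((b - mean_b) ** 2 for b in profile_b)
    if var_a == 0 or var_b == 0:
        return 0.0  # a flat profile carries no co-elution signal
    return cov / math.sqrt(var_a * var_b)


# Hypothetical abundances across eight fractions (illustrative values only).
upstream = [0, 5, 9, 4, 0, 0, 0, 0]    # single early peak
middle = [0, 4, 8, 3, 0, 3, 7, 2]      # two peaks, one shared with each neighbour
downstream = [0, 0, 0, 0, 0, 4, 9, 3]  # single late peak

r_up = pearson(middle, upstream)
r_down = pearson(middle, downstream)
r_skip = pearson(upstream, downstream)

print(f"middle vs upstream:     {r_up:.2f}")
print(f"middle vs downstream:   {r_down:.2f}")
print(f"upstream vs downstream: {r_skip:.2f}")
```

In this toy case the two-peak profile correlates positively with both of its neighbours, whereas the two non-adjacent profiles are anticorrelated, which is the qualitative pattern the text describes for consecutively acting enzymes.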

Despite the diversity of multicellular organisms, our study reveals
fundamental attributes of the macromolecular machinery of animal
cells with near universal pertinence to metazoan biology, development
and evolution. Our extremely large set of supporting biochemical
fractionation data (via ProteomeXchange with identifiers
PXD002319-PXD002328), PPIs (via BioGRID;
http://thebiogrid.org/185267/publication/) and interaction network
projections are fully accessible (http://metazoa.med.utoronto.ca) to
facilitate in-depth exploration. Although we focused on global conservation
properties, these data can be analysed at the individual animal species
or complex levels to assess the variety and functional adaptations of
particular protein assemblies across phyla.


Online Content Methods, along with any additional Extended Data display items 
and Source Data, are available in the online version of the paper; references unique 
to these sections appear only in the online paper. 


Received 15 December 2014; accepted 30 June 2015. 
Published online 7 September 2015. 


1. Hartwell, L. H., Hopfield, J. J., Leibler, S. & Murray, A. W. From molecular to modular
cell biology. Nature 402, C47-C52 (1999).

2. Alberts, B. The cell as a collection of protein machines: Preparing the next
generation of molecular biologists. Cell 92, 291-294 (1998).

3. Butland, G. et al. Interaction network containing conserved and essential protein 
complexes in Escherichia coli. Nature 433, 531-537 (2005). 

4. Krogan, N. J. et al. Global landscape of protein complexes in the yeast
Saccharomyces cerevisiae. Nature 440, 637-643 (2006).

5. Guruharsha, K. G. et al. A protein complex network of Drosophila melanogaster. Cell
147, 690-703 (2011).

6. Havugimana, P. C. et al. A census of human soluble protein complexes. Cell 150,
1068-1081 (2012).

7. Stelzl, U. et al. A human protein-protein interaction network: a resource for
annotating the proteome. Cell 122, 957-968 (2005).

8. Li, S. et al. A map of the interactome network of the metazoan C. elegans. Science
303, 540-543 (2004).

9. Hu, P. et al. Global functional atlas of Escherichia coli encompassing previously 
uncharacterized proteins. PLoS Biol. 7, e1000096 (2009). 

10. Rolland, T. et al. A proteome-scale map of the human interactome network. Cell 
159, 1212-1226 (2014). 

11. Sharan, R. et al. Conserved patterns of protein interaction in multiple species. Proc.
Natl Acad. Sci. USA 102, 1974-1979 (2005).

12. Gandhi, T. K. B. et al. Analysis of the human protein interactome and comparison
with yeast, worm and fly interaction datasets. Nature Genet. 38, 285-293 (2006).

13. Tan, K., Shlomi, T., Feizi, H., Ideker, T. & Sharan, R. Transcriptional regulation of
protein complexes within and across species. Proc. Natl Acad. Sci. USA 104,
1283-1288 (2007).

14. Singh, R., Xu, J. B. & Berger, B. Global alignment of multiple protein interaction
networks with application to functional orthology detection. Proc. Natl Acad. Sci.
USA 105, 12763-12768 (2008).

15. Yu, H. et al. Annotation transfer between genomes: protein-protein interologs and
protein-DNA regulogs. Genome Res. 14, 1107-1118 (2004).

16. Ideker, T. & Krogan, N. J. Differential network biology. Mol. Syst. Biol. 8, 565 (2012). 

17. Kiemer, L. & Cesareni, G. Comparative interactomics: comparing apples and 
pears? Trends Biotechnol. 25, 448-454 (2007). 

18. von Mering, C. et al. Comparative assessment of large-scale data sets of
protein-protein interactions. Nature 417, 399-403 (2002).

19. Malovannaya, A. et al. Analysis of the human endogenous coregulator
complexome. Cell 145, 787-799 (2011).

20. Lee, I., Blom, U. M., Wang, P. I., Shim, J. E. & Marcotte, E. M. Prioritizing candidate
disease genes by network-based boosting of genome-wide association data.
Genome Res. 21, 1109-1121 (2011).

21. Uhlen, M. et al. Towards a knowledge-based Human Protein Atlas. Nature
Biotechnol. 28, 1248-1250 (2010).

22. McKusick, V. A. Mendelian Inheritance in Man: A Catalog of Human Genes and
Genetic Disorders (Johns Hopkins Univ. Press, 1998).




23. Kim, M. S. et al. A draft map of the human proteome. Nature 509, 575-581 (2014).

24. Rubin, G. M. et al. Comparative genomics of the eukaryotes. Science 287,
2204-2215 (2000).

25. Bezginov, A., Clark, G. W., Charlebois, R. L., Dar, V. U. N. & Tillier, E. R. M. Coevolution
reveals a network of human proteins originating with multicellularity. Mol. Biol.
Evol. 30, 332-346 (2013).

26. Stumpf, M. P. H. et al. Estimating the size of the human interactome. Proc. Natl
Acad. Sci. USA 105, 6959-6964 (2008).

27. Hart, G. T., Ramani, A. K. & Marcotte, E. M. How complete are current yeast and
human protein-interaction networks? Genome Biol. 7, 120 (2006).

28. Eisenberg, E. & Levanon, E. Y. Preferential attachment in the protein network
evolution. Phys. Rev. Lett. 91, 138701 (2003).

29. Knoll, A. H. The early evolution of eukaryotes: a geological perspective. Science 
256, 622-627 (1992). 

30. Burstein, E. et al. COMMD proteins, a novel family of structural and functional
homologs of MURR1. J. Biol. Chem. 280, 22222-22232 (2005).

31. van de Sluis, B., Rothuizen, J., Pearson, P. L., van Oost, B. A. & Wijmenga, C.
Identification of a new copper metabolism gene by positional cloning in a
purebred dog population. Hum. Mol. Genet. 11, 165-173 (2002).

32. McDonald, F. J. COMMD1 and ion transport proteins: what is the COMMection?
Focus on “COMMD1 interacts with the COOH terminus of NKCC1 in Calu-3 airway
epithelial cells to modulate NKCC1 ubiquitination”. Am. J. Physiol. Cell Physiol. 305,
C129-C130 (2013).

33. Kolanczyk, M. et al. Missense variant in CCDC22 causes X-linked recessive
intellectual disability with features of Ritscher-Schinzel/3C syndrome. Eur. J. Hum.
Genet. 109, 1-6 (2014).

34. Voineagu, I. et al. CCDC22: a novel candidate gene for syndromic X-linked
intellectual disability. Mol. Psychiatry 17, 4-7 (2012).

35. Toledo, C. M. et al. BuGZ is required for Bub3 stability, Bub1 kinetochore function,
and chromosome alignment. Dev. Cell 28, 282-294 (2014).

36. Kotake, Y. et al. Splicing factor SF3b as a target of the antitumor natural product 
pladienolide. Nature Chem. Biol. 3, 570-575 (2007). 

37. Croft, D. et al. The Reactome pathway knowledgebase. Nucleic Acids Res. 42, 
D472-D477 (2014). 

38. Ovadi, J. Cell Architecture and Metabolite Channeling. (RG Landes Company, 1995). 

39. Ruepp, A. et al. CORUM: the comprehensive resource of mammalian protein 
complexes-2009. Nucleic Acids Res. 38, D497-D501 (2010). 

40. Warde-Farley, D. et al. The GeneMANIA prediction server: biological network
integration for gene prioritization and predicting gene function. Nucleic Acids Res. 
38, W214-W220 (2010). 

41. Franceschini, A. et al. STRING v9.1: protein-protein interaction networks, 
with increased coverage and integration. Nucleic Acids Res. 41, D808-D815 
(2013). 

42. Pu, S., Wong, J., Turner, B., Cho, E. & Wodak, S. J. Up-to-date catalogues of yeast
protein complexes. Nucleic Acids Res. 37, 825-831 (2009).

43. Kirkwood, K. J., Ahmad, Y., Larance, M. & Lamond, A. I. Characterization of native
protein complexes and protein isoform variation using size-fractionation-based 
quantitative proteomics. Mol. Cell. Proteomics 12, 3851-3873 (2013). 


Supplementary Information is available in the online version of the paper. 


Acknowledgements We thank G. Bader, P. Kim, G. Moreno-Hagelsieb, S. Pu and
S. Wodak for critical suggestions, illustrator A. Syrett for expert help drafting figures,
T. Kwon (University of Texas) for X. laevis gene models, and K. Foltz (University of 
California, Santa Barbara), A. Brehm (Philipps-University Marburg), P. Paddison (Fred 
Hutchinson Cancer Research Center), J. Smith (Woods Hole Marine Biological 
Laboratory), P. Zandstra and J. Moffat (University of Toronto) for providing biological 
specimens and reagents. We thank members of the Emili and Marcotte laboratories for 
assistance and guidance, and SciNet (University of Toronto) and the Texas Advanced 
Computing Center (University of Texas) for high-performance computing resources. 
This work was supported by grants from the CIHR, NSERC, ORF and the CFI to A.E., from 
the CIHR and Heart and Stroke to J.P., from the NIH (F32GM112495) to K.D., and
from the NIH, NSF, CPRIT, and Welch Foundation (F-1515) to E.M.M. 


Author Contributions A.E. and E.M.M. designed and co-supervised the project. C.W. 
performed proteomic experiments, aided by P.C.H. B.B. coordinated data analysis, 
aided by S.Ph., K.D. and S.S., and guided by E.M.M. E.R.T., G.Cl., A.B., J.P., X.X., K.C., G.Cr.,
C.W. and S.Ph. analysed network and conservation data. C.W., F.T., O.K., J.K., S.Pa., O.P.,
Z.N., D.R.B., X.G., R.H.M., M.S., J.G., M.B., W.B.D. and J.B.W. contributed validation
experiments. S.Ph. designed the web portal. C.W., B.B., E.M.M. and A.E. drafted the
manuscript. All authors discussed results and contributed edits. 


Author Information Reprints and permissions information is available at 
www.nature.com/reprints. The authors declare no competing financial interests. 
Readers are welcome to comment on the online version of the paper. Correspondence 
and requests for materials should be addressed to E.M.M. 
(marcotte@icmb.utexas.edu) or A.E. (andrew.emili@utoronto.ca). 




[Extended Data Figure 1 graphic. Panels a and d: precision, TP/(TP+FP), versus
recall, TP/(TP+FN). Panel a curves: All data; Fractionation data only; and
subsets of species fractionations plus external data. Panel b: proportion of
PPI across species. Panel c: novel PPI across species. Panel d markers:
~1M scored interactions; High-confidence PPIs; Clustered PPIs.]




Extended Data Figure 1 | Performance measures. a, Performance
benchmarks, measuring the precision and recall of our method and data in
identifying known co-complex interactions from a withheld reference set
of annotated human complexes (from CORUM³⁹; as in Fig. 2b). Fivefold
cross-validation against this withheld set shows strong performance gains,
beyond a baseline achieved using only human and mouse co-fractionation data
along with additional evidence from independent protein interaction screens
and a functional gene network (far-left curve), made by integrating
co-fractionation data from the additional non-human animal species (as
indicated). ‘All data’ and ‘Fractionation data only’ curves include biochemical
fractionation data from all five input species: human, mouse, urchin, fly and
worm; the latter curve omits all external data. In all cases, at least two
species were required to show supporting biochemical evidence. Recall refers to
the fraction of 4,528 total positive interactions derived from the withheld
human CORUM complexes. b, All 16,655 interactions were identified in at
least two species, with about half (49%, 8,121) found in three or more species.
c, Among these high-confidence co-complex interactions, 8,981 (54%) were not
reported in iRefWeb⁴⁴ (v13.0), BioGRID⁴⁵ (v3.2.119) or CORUM reference
(Supplementary Table 2) for any of the five input species or in yeast; about
half (46%, 4,128) of these novel co-complex interactions display evidence of
co-fractionation in three or more species. d, Final precision/recall
performance on withheld interaction test set. A support vector machine
classifier was trained using interactions derived from our training set of
CORUM complexes, then ~1 million protein pairs found to co-elute in at least
two of the five input species were scored by the classifier. Black curve shows
precision and recall for ranked list of co-eluting pairs, with recall
representing the fraction recovered of 4,528 total positive interactions
derived from the withheld set of merged human CORUM complexes, and precision
measured using co-eluting pairs where both members of the pair are contained
in the set of proteins represented in the CORUM withheld set. The top 16,655
pairs, giving a cumulative precision of 67.5% and recall of 23.0% on this
withheld test set, form the high-confidence set of co-complex protein-protein
interactions (blue circle). The highest-scoring interactions were clustered
using the two-stage approach described in the Supplementary Methods, yielding
a final set of 7,669 interactions, which form the 981 identified complexes
(red circle; precision = 90.0%, recall = 20.8%).


44. Turner, B. et al. iRefWeb: interactive analysis of consolidated protein interaction
data and their supporting evidence. Database 2010, baq023 (2010).

45. Stark, C. et al. BioGRID: a general repository for interaction datasets. Nucleic Acids
Res. 34, D535-D539 (2006).
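
The precision and recall values quoted in the Extended Data Figure 1 legend follow from the definitions on the plot axes, precision = TP/(TP + FP) and recall = TP/(TP + FN). A minimal Python sketch of that calculation for the top k entries of a ranked pair list is given here. It assumes, as the legend states, that precision is assessed only on pairs whose two members both occur in the reference protein set; the function name `precision_recall_at_k` and the toy data are hypothetical and are not taken from the study's code.

```python
def precision_recall_at_k(ranked_pairs, positive_pairs, reference_proteins, k):
    """Cumulative precision and recall for the k highest-ranked protein pairs.

    ranked_pairs: list of (protein, protein) tuples, best score first.
    positive_pairs: set of frozensets, the withheld reference interactions.
    reference_proteins: set of proteins present in the withheld complexes.
    """
    top = [frozenset(pair) for pair in ranked_pairs[:k]]
    # Only pairs fully contained in the reference protein set can be judged.
    evaluable = [pair for pair in top if pair <= reference_proteins]
    tp = sum(1 for pair in evaluable if pair in positive_pairs)
    fp = len(evaluable) - tp
    fn = len(positive_pairs) - tp
    precision = tp / (tp + fp) if evaluable else float("nan")
    recall = tp / (tp + fn) if positive_pairs else float("nan")
    return precision, recall


# Toy data (hypothetical): five ranked pairs, three withheld positives.
ranked = [("A", "B"), ("A", "C"), ("X", "Y"), ("B", "C"), ("A", "D")]
positives = {frozenset({"A", "B"}), frozenset({"B", "C"}), frozenset({"C", "D"})}
reference = {"A", "B", "C", "D"}

precision, recall = precision_recall_at_k(ranked, positives, reference, k=4)
print(f"precision = {precision:.3f}, recall = {recall:.3f}")
```

Sweeping k from 1 to the length of the ranked list would trace out a precision/recall curve of the kind plotted in panels a and d.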




[Extended Data Figure 2 graphic. Panel a: distribution of correlation
(-1.0 to 1.0) for predicted PPI versus random pairs. Panel b: heat map of
cross-species correlation. Panel c: distance between subunits (Å) versus
correlation coefficient, binned from 0.0 to 1.0.]


Extended Data Figure 2 | Properties of protein elution profiles.
a, Distribution of global protein tissue expression pattern similarity, measured
as the Pearson correlation coefficient of protein abundance across 30 human
tissues²³, showing markedly higher correlations for 16,468 protein-protein
pairs of putative co-complex interaction partners compared to the same
number of randomized pairs of proteins in the network which were not
predicted to interact. b, Heat map illustrating the low to moderate cross-species
Spearman’s rank correlation coefficients in the elution profiles observed
between orthologous proteins during mixed-bed ion exchange
chromatography under standardized conditions, highlighting the shift in
absolute chromatographic retention times in different species. This variation
indicates that the conservation of co-fractionation by putatively interacting
proteins is not merely a trivial result stemming from fixed column-retention
times. c, The degree of co-fractionation is measured as the correlation
coefficient between elution profiles. Spatial proximity is calculated from the
mean of residue pair distances between components of multisubunit
complexes with known three-dimensional structures (see Supplementary
Methods).


[Extended Data Figure 3 graphic. Panel a: fraction of total proteins in the
set (log scale) versus number of complexes participated in, for Metazoan
conserved complexes (this work), CORUM Merged and CORUM Unmerged. Panel b:
network schematic of the identified complexes.]


Extended Data Figure 3 | Derivation of complexes. a, The 2,153 proteins
present in the 981 derived metazoan complexes participate in multiple
assemblies (‘moonlighting’) to an extent comparable to the sharing of subunits
reported for literature-derived complexes (CORUM). For comparison, we
examined the 1,550 unique proteins from the full CORUM set of 1,216 human
complexes passing our selection criteria for supporting evidence (‘Unmerged’)
and the 1,461 unique proteins from the non-redundant set of 501 merged
complexes used as the reference for splitting our training and testing sets, with
some of the largest complexes removed to avoid bias in training (‘Merged’;
see ‘Optimizing the two-stage clustering’ in Supplementary Methods for
details). b, Schematic of 981 identified complexes containing 2,153 unique
proteins. In this graphical representation, 7,669 co-complex interactions are
shown as lines, and proteins as nodes. Red and green interactions were
previously annotated in CORUM. Red interactions were used in training the
classifier and/or clustering procedure, while green interactions were held out
for validation purposes. Grey interactions were not previously annotated
in CORUM.




[Extended Data Figure 4 graphic. Panel a: proportions of old and new proteins
in complexes and PPIs from this study, CORUM, AP/MS and Y2H. Panel b:
annotation rates (KEGG, Reactome, E.C. number) for old and new proteins in
complexes, PPIs and CORUM. Panel c: z-score heat maps for metazoan, mixed and
old complexes across GO-slim terms (for example, cytoskeleton organization,
mitosis, protein folding) and Pfam domains (for example, Myosin, RRM_1,
HSP70).]
Extended Data Figure 4 | Properties of new and old proteins and
complexes. a, The 2,153 protein components in the conserved animal
complexes tend to be more ancient than the 2,301 proteins reported in the
CORUM reference complexes or in two recent large-scale protein interaction
assays, based on either the 7,062 proteins found by affinity purification/mass
spectrometry (AP/MS; E. L. Huttlin et al., BioGRID preprint 166968,
http://thebiogrid.org/166968/publication/) or the 3,667 proteins analysed by
yeast two-hybrid assays (Y2H)¹⁰. Ages are derived from OMA (Orthologous Matrix
database) as in ref. 25. b, Annotation rates (mean count of annotation terms per
protein) of old and new proteins in the derived complexes and pairwise PPIs,
compared with proteins in the CORUM reference complex set. Old proteins
(defined by OMA) from the complexes generally exhibited higher annotation
rates than new proteins. c, Differential enrichment of old, mixed and
metazoan-specific protein complexes for functional annotations (select
GO-slim biological process terms shown, top) and protein domains (Pfam,
bottom).




[Extended Data Figure 5 graphic. Panels a, b: pie charts of tissue expression
categories (Expressed in all tissues; Mixed; Tissue enhanced; Group enriched;
Tissue enriched; Not detected). Panel c: frequency versus mRNA abundance,
log2 (FPKM). Panel d: frequency versus protein abundance, log2 (PPM). Series
in c, d: Complexes (this work); All proteins identified (this work); CORUM
(old); CORUM (all); EBI 27 human tissue mRNA (c) or PaxDb all human
proteins (d).]


Extended Data Figure 5 | Abundance and expression trends for proteins in
complexes. Proteins within the identified complexes tend to be ubiquitously
expressed across human tissues. a, b, Pie charts show the proportions of
proteins with varying tissue expression patterns, from a recently published
human tissue proteome map⁴⁶, comparing the full set of 20,258 human
proteins (a) with the 2,131 proteins within the identified complexes
(b). Consistent with these observations, 91% of the protein components in the
complexes were expressed in >15 tissues in data from a reference human
proteome²³, compared to less than half (46%) of the 17,294 proteins in the
overall reference set (Z-test P < 0.001). c, d, The distributions of average
mRNA (c, data from EBI accession E-MTAB-1733) and protein (d, data from
PaxDb integrated data set, 9606-H.sapiens_whole_organism-integrated_data
set) abundances for all proteins identified and those within complexes.
Evolutionarily old proteins (defined by OMA as described in ref. 25 and
mentioned earlier) tend towards higher abundances, even for proteins in
reference complexes.

46. Uhlen, M. et al. Tissue-based map of the human proteome. Science 347, 6220 
(2015). 




[Extended Data Figure 6 graphic. Panel a: immunoblots of Flag
immunoprecipitates probed for VPS4A, VPS4B and IST1. Panel b: precision,
TP/(TP+FP), versus recall, TP/(TP+FN), for ‘All data’ and ‘All data,
excluding yeast’.]


Extended Data Figure 6 | Additional validation data. a, Confirmation of
MIB2 interactions by co-immunoprecipitation. Extract (~10 mg protein)
from cultured human HCT116 cells expressing Flag-tagged MIB2 or control
(WT) cells was incubated with 100 µl anti-Flag M2 resin for 4 h while gently
rotating at 4 °C. After extensive washing with RIPA buffer, co-purifying
proteins bound to the beads were eluted by the addition of 25 µl Laemmli
loading buffer at 95 °C. Polypeptides were separated by SDS-PAGE and
immunoblotted using Flag, VPS4A, VPS4B or IST1 antibodies as indicated
(expanded gel images provided in Supplementary Information). b, Protein
co-complex interactions reported in the CYC2008 yeast protein complex
database⁴² are reconstructed accurately from the co-fractionation data,
regardless of whether the full set of co-fractionation plus external data are used
to derive protein interactions (‘All data’, see also Fig. 4b) or if the external
yeast data was specifically excluded from the analyses (‘All data,
excluding yeast’).




[Extended Data Figure 7 graphic. Average normalized peptide count versus SEC
fraction (panel a) and versus sucrose fraction (panel b), each shown for three
molecular-weight classes: MW < 100 kDa; 100 kDa < MW < 700 kDa;
700 kDa < MW.]

Extended Data Figure 7 | Agreement of derived complexes’ molecular
weights with measurement by HPLC and density centrifugation.
a, CORUM reference complexes’ inferred molecular weights (MW) are
consistent with their components’ average cumulative size-exclusion
chromatograms. The molecular weight of each complex was calculated as the
sum of putative component molecular weights, assuming 1:1 stoichiometry.
Data from ref. 43 were analysed as in Fig. 4c and show a similar trend as
for the derived complexes. b, Derived complexes’ inferred molecular
weights are broadly consistent with their components’ average cumulative
ultracentrifugation profiles on a sucrose density gradient. Average profiles are
plotted for X. laevis orthologues, based on a preparation of
haemoglobin-depleted heart and liver proteins separated on a 7-47% sucrose
density gradient, as described in the Supplementary Methods.
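
The molecular-weight inference described in panel a can be written as a short Python sketch: the weight of each complex is the sum of its subunit weights under the stated 1:1 stoichiometry, and complexes are then grouped into the three size classes shown in the figure (below 100 kDa, 100 to 700 kDa, above 700 kDa). The subunit names and weights are hypothetical placeholders, and the treatment of values that fall exactly on a class boundary is an assumption.

```python
def complex_weight_kda(subunits, subunit_weights_kda):
    """Sum subunit weights, counting each subunit once (1:1 stoichiometry)."""
    return sum(subunit_weights_kda[name] for name in set(subunits))


def size_class(weight_kda):
    """Assign one of the three molecular-weight classes used in the figure."""
    if weight_kda < 100:
        return "MW < 100 kDa"
    if weight_kda < 700:
        return "100 kDa < MW < 700 kDa"
    return "700 kDa < MW"


# Hypothetical subunit weights in kDa (placeholders, not measured values).
weights = {"P1": 22.0, "P2": 25.5, "P3": 71.0, "P4": 68.5}

small = complex_weight_kda(["P1", "P2"], weights)
medium = complex_weight_kda(["P1", "P2", "P3", "P4"], weights)

print(small, size_class(small))
print(medium, size_class(medium))
```

Real complexes often depart from 1:1 stoichiometry, so a sum of this kind is best read as a lower-bound estimate when compared with chromatographic or sedimentation profiles.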




[Extended Data Figure 8 graphic: network diagrams of the derived complexes.]

Extended Data Figure 8 | Distribution of uncharacterized proteins and
novel interactions across the 981 derived complexes. Complexes were sorted
by median age (defined by OMA). Among 2,153 unique proteins, 293 (red)
lack Gene Ontology (GO) functional annotations, while 1,756 of 7,665
co-complex interactions are novel (light green) (not listed in iRefWeb curation
database).



[Extended Data Figure 9 graphic. Panels a-c: elution profiles of Commander
subunits across fractions for Hs_hekN_1108, Sp_1109_PF5 and Dm_1107_DSL.
Panel d: mRNA abundance (% of max) of COMMD1, COMMD2, COMMD5 and SH3GLB1 by
stage of X. tropicalis embryonic development. Panels e-h: in situ images,
Pax6 area measurements, and RT-PCR products (native transcript 461 bp;
morphant transcript 378 bp) for control, Commd2 MOatg, Commd2 MOsp and
Commd3 MOatg animals.]




Extended Data Figure 9 | Properties of the Commander complex. The
automatically derived 8-subunit Commander complex (Fig. 3b) was
subsequently extended to 13 subunits (COMMD1 to 10, CCDC22, CCDC93,
and SH3GLB1) based on combined analysis of AP/MS (Fig. 4a), size-exclusion
chromatograms⁴³ (Fig. 4d), published pairwise interactions⁴⁷,⁴⁸, and
analysis of elution profiles of the remaining COMM-domain-containing
proteins, as shown here. Example protein elution profiles are plotted for
Commander complex subunits observed from: HEK293 cell nuclear extract
(a); sea urchin embryonic (5 days post-fertilization) extract (b); and fly SL2 cell
nuclear extract (c); each fractionated by heparin affinity chromatography.
d, Co-expression of Commander complex subunits during embryonic
development of X. tropicalis (plotting mean ± s.d. of three clutches; data from
ref. 49). e, Messenger RNA expression patterns of Commander complex
subunits in stage 15 X. laevis embryos. Images show coordinated spatial
expression in early vertebrate embryogenesis, as measured by in situ
hybridization (three embryos examined). f, Knockdown of Commd2 induced
marked head and eye defects in developing X. laevis. Top, Commd2 antisense
knockdown significantly decreased eye size, shown for stage 38 tadpoles
(from three clutches; control n = 47 animals, one eye each; ***P < 0.0001,
two-sided Mann-Whitney test); phenotypes were consistent between
translation blocking (MOatg; n = 60) morpholino reagents, splice site blocking
(MOsp; n = 50) morpholinos, and knockdowns of interaction partner
Commd3 (see Fig. 5a). Bottom, Commd2-knockdown induced altered Pax6
patterning in the embryonic eye (control n = 8 animals, two eyes each; MO
n = 11). g, Commd2/3-knockdown animals show altered neural patterning.
Changes in stage 15 X. laevis embryos, measured by in situ hybridization
(assayed in duplicates; five embryos per treatment), seen upon knockdown but
not on controls: the forebrain marker PAX6 was expanded, while the mid-brain
marker EN2 was strongly reduced. Notably, while expression of KROX20/EGR1
in rhombomere R3 was shifted posteriorly, expression in R5 was strongly
reduced or entirely absent. Panels in Fig. 5b are reproduced from this figure
and are directly comparable. h, Confirmation of splice-blocking Commd2
morpholino activity. Images and schematic show the basis and results of
RT-PCR and agarose gel electrophoresis obtained with the corresponding
X. laevis knockdown tadpoles.


47. de Bie, P. et al. Characterization of COMMD protein-protein interactions in NF-κB
signalling. Biochem. J. 398, 63-71 (2006).

48. Phillips-Krawczak, C. A. et al. COMMD1 is linked to the WASH complex and
regulates endosomal trafficking of the copper transporter ATP7A. Mol. Biol. Cell 26,
91-103 (2015).

49. Yanai, I., Peshkin, L., Jorgensen, P. & Kirschner, M. W. Mapping gene expression in
two Xenopus species: evolutionary constraints and developmental flexibility. Dev.
Cell 20, 483-496 (2011).




[Extended Data Figure 10 graphic. Panel a: alignment of the ZNF207 GLEBS
domain across H. sapiens, M. musculus, X. laevis, S. purpuratus,
D. melanogaster, C. elegans, N. vectensis and S. cerevisiae, with per-position
conservation scores. Panel b: western blot of CCDC97 and β-actin. Panel c:
growth over days for Scramble CRISPR, CCDC97 CRISPR-1 and CCDC97 CRISPR-6
lines.]
Extended Data Figure 10 | Supporting data for BUB3 and CCDC97
experiments. a, Sequence alignment showing conservation of ZNF207 GLEBS
domain. b, Targeted CRISPR/Cas9-induced knockout of CCDC97 in two
independent lines of human HEK293 cells, as verified by western blotting
(expanded gel images provided in Supplementary Information). c, Loss of
CCDC97 impairs cell growth. Lines show growth curves of control versus
knockout cell lines in two biological replicate assays.




ARTICLE 


doi:10.1038/nature14887 


The mechanism of DNA replication 
termination in vertebrates 


James M. Dewar¹, Magda Budzowska¹ & Johannes C. Walter¹,²


Eukaryotic DNA replication terminates when replisomes from adjacent replication origins converge. Termination involves 
local completion of DNA synthesis, decatenation of daughter molecules and replisome disassembly. Termination has been 
difficult to study because termination events are generally asynchronous and sequence nonspecific. To overcome these 
challenges, we paused converging replisomes with a site-specific barrier in Xenopus egg extracts. Upon removal of the 
barrier, forks underwent synchronous and site-specific termination, allowing mechanistic dissection of this process. We 
show that DNA synthesis does not slow detectably as forks approach each other, and that leading strands pass each other 
unhindered before undergoing ligation to downstream lagging strands. Dissociation of the replicative CMG helicase 
(comprising CDC45, MCM2-7 and GINS) occurs only after the final ligation step, and is not required for completion of 
DNA synthesis, strongly suggesting that converging CMGs pass one another and dissociate from double-stranded DNA. 
This termination mechanism allows rapid completion of DNA synthesis while avoiding premature replisome disassembly.


DNA replication occurs in three broad stages: initiation, elongation and 
termination. Termination occurs when converging replication forks 
meet and involves at least four processes, not necessarily in the follow- 
ing order. First, the last stretch of parental DNA between forks is 
unwound (dissolution) and replisomes come into contact; second, 
any remaining gaps in the daughter strands are filled in and nascent 
strands are ligated (ligation); third, double-stranded (ds)DNA inter- 
twinings (that is, catenanes) are removed (decatenation); fourth, the 
replisome is disassembled. Despite decades of research on termination,
we know little about the order, mechanism and regulation of the above 
events, especially during eukaryotic chromosomal replication. 

Termination has been most extensively studied in the mammalian
DNA tumour virus SV40 (ref. 2), where converging replication forks
stall during termination. Dissolution during SV40 replication
requires rotation of the entire fork to produce catenations behind
the fork (pre-catenanes), which are resolved by topoisomerase
(Topo) II (ref. 6), probably in a manner similar to how Topo IV
functions during bacterial termination. The SV40 replicative
helicase, large T antigen, dissociates from chromatin before dissolution,
but whether this is required for the completion of replication
is unknown. After dissolution, daughter strands retain gaps of
~60 nucleotides, which are ultimately filled in by an unknown
mechanism in parallel to decatenation.

Eukaryotic termination has also been investigated. Although convergent
forks accumulate at certain replication pause sites in yeast
cells lacking 5′-3′ DNA helicases, it is unknown whether forks
stall during unperturbed termination. Furthermore, Topo II is not
required for dissolution in budding yeast or during vertebrate
termination. Recent work shows that late in S phase, the eukaryotic
replicative helicase CMG is removed from chromatin by the
ATPase p97 after ubiquitylation of MCM7 (by SCF(Dia2) in yeast).
While one study implied that DNA replication can go to completion
in the absence of CMG unloading, another reported that tracts of
unreplicated DNA remain in the absence of this process. Given that
mis-regulation of bacterial termination can readily trigger re-replication
of DNA, a potent driver of genomic instability in mammalian cells,
a better understanding of eukaryotic termination is essential.

Owing to stochastic origin firing and variable rates of replisome
progression, the location and timing of eukaryotic termination are
variable, making this process difficult to study. Here we report that
Xenopus egg extracts can be used to induce synchronous and localized
termination events. This approach has allowed us to identify and
order key events underlying vertebrate termination.


A system to study replication termination 


Our strategy was to stall forks on either side of a reversible replication
fork barrier (Fig. 1a, panels i-iii), and subsequently disassemble the
barrier to trigger localized and synchronous termination events
(Fig. 1a, panel iv). The barrier that we employed consisted of an array
of lac repressors (LacRs) bound to lac operators (lacOs), which can
be disrupted by IPTG. We constructed p[lacO×16], which contains 16
tandem copies of lacO (490 base pairs (bp)). p[lacO×16] was incubated
in nucleus-free Xenopus egg extract, which promotes sequence-nonspecific
replication initiation on added DNA molecules, followed
by a single, complete round of DNA synthesis via a mechanism that
appears to reflect events in cells. To monitor replication, radioactive
[α-32P]dATP was included in the reaction. When p[lacO×16] was replicated
in the absence of LacR for ~5 min and then cut with XmnI
(Fig. 1a, panel iii), a single linear species representing fully replicated
daughter molecules was observed (Fig. 1c, lane 1). In contrast, in the
presence of LacR, a slow-mobility product appeared (Fig. 1c, lane 4)
that corresponds to a double-Y structure, as shown by 2D gel electrophoresis
(Extended Data Fig. 1a). To confirm that the double-Y
resulted from fork stalling at the outer edges of the array, we separately
monitored replication in the plasmid backbone and in the lacO array.
In the presence of LacR, synthesis of the array was specifically delayed
(Extended Data Fig. 1f). In contrast, LacR had no effect on replication
of a plasmid lacking lacO sites (Extended Data Fig. 1e). These results
indicate that replication forks stalled on both sides of the LacR array,
consistent with previous findings.


¹Department of Biological Chemistry and Molecular Pharmacology, Harvard Medical School, Boston, Massachusetts 02115, USA. ²Howard Hughes Medical Institute, Department of Biological Chemistry and Molecular Pharmacology, Harvard Medical School, Boston, Massachusetts 02115, USA.


17 SEPTEMBER 2015 | VOL 525 | NATURE | 345 




[Figure 1 graphics not reproduced. Panel a: scheme showing the AlwNI and XmnI sites, the 490-bp array and IPTG addition at 5 min. Panels b, d, f: schematics of the dissolution assay (XmnI digest), the ligation assay (AlwNI digest; strands of ~2,000, ~550 and 3,148 nt) and the decatenation assay (undigested). Panels c, e, g: gels, buffer versus LacR, IPTG added at 5 min. Panel h: completion (%) of dissolution, ligation and decatenation plotted against time (min).]

Figure 1 | A model system to study replication termination. a, Scheme to induce site-specific termination. Key restriction sites are highlighted. b, Schematic of the dissolution assay. c, p[lacO×16] was incubated in buffer or LacR, then replicated in the presence of [α-32P]dATP, before termination was induced by the addition of IPTG. To measure dissolution, radiolabelled termination intermediates were cut with XmnI, separated on a native agarose gel, and analysed by autoradiography. d, Schematic of the ligation assay. e, To measure ligation, replication intermediates were cut with AlwNI and separated on a denaturing agarose gel. f, Schematic of the decatenation assay. g, To measure decatenation, replication intermediates were separated on a native agarose gel. The additional copy of lane 10 highlights catenated termination intermediates. Cats, catenanes; CMs, circular monomers; DYs, double-Y structures; FLS, full-length strands; Lins, linears; LWS, leftward strands; n-n, nicked-nicked; n-sc, nicked-supercoiled; RWS, rightward


We next addressed whether replication forks stalled by LacR could 
restart. When IPTG was added to double-Y structures 5 min after 
replication initiation, 90% were converted to unit-sized linear plasmid 
molecules within a further 1.5 min (Fig. 1c, lanes 5-10 and Fig. 1h, 
yellow circles). In the absence of IPTG, only 21% of double-Y molecules
disappeared after 3 min (Fig. 1c, lane 18). The conversion of
double-Y molecules to linear species occurs when any remaining 
parental DNA holding daughter molecules together is unwound 
(Fig. 1b). This process, which we refer to as ‘dissolution’, represents 
a convenient means to measure the point at which converging repli- 
somes meet. Notably, the ATR-Chk1 pathway was not activated 
above background levels during this procedure (data not shown). 

After dissolution, nascent strands should undergo ligation. To
detect the growth and ligation of nascent strands, we digested
p[lacO×16] with AlwNI, which cuts the plasmid once, ~550 nucleotides
(nt) from the rightward edge of the array and ~2,000 nt from its
leftward edge (Fig. 1a, panel iii, and Fig. 1d), and we analysed the
products on a denaturing gel. Before IPTG addition, discrete species
of ~2,000 nt (Fig. 1e, lane 4) and ~550 nt (Extended Data Fig. 2a,
lane 4) were observed. Upon IPTG addition, both bands grew
heterogeneously (Fig. 1e and Extended Data Fig. 2a). Since all leading
strands were immediately extended upon IPTG addition (Extended
Data Fig. 2b, c), we infer that the heterogeneity observed resulted
because growth of the lagging strand was delayed until ligation
of an additional Okazaki fragment. Finally, the nascent strands
increased abruptly to the full length of 3,100 nt as ligation to downstream
lagging strands occurred (Fig. 1e, lanes 9-13). As expected,
dissolution preceded ligation, and there was an ~45 s delay between
these two events (Fig. 1h).
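
A delay of this kind is read off completion curves such as those in Fig. 1h. One way to extract it, sketched here in Python, is to interpolate the time at which each curve reaches 50% completion and take the difference. The function name, the 50% threshold and the example numbers are illustrative assumptions; they are not the quantification method or the measurements reported in the paper.

```python
def time_to_percent(times, percents, target=50.0):
    """Return the first time at which a completion curve crosses `target`,
    using linear interpolation between neighbouring time points."""
    points = list(zip(times, percents))
    for (t0, p0), (t1, p1) in zip(points, points[1:]):
        # Only rising segments that bracket the target can contain the crossing.
        if p0 <= target <= p1 and p1 > p0:
            return t0 + (target - p0) * (t1 - t0) / (p1 - p0)
    raise ValueError("curve never reaches the target percentage")


# Invented example values (minutes, % completion); NOT data from the paper.
times = [5.0, 5.5, 6.0, 6.5, 7.0, 7.5]
dissolution = [0, 20, 60, 90, 100, 100]
ligation = [0, 0, 10, 40, 80, 100]

lag_min = time_to_percent(times, ligation) - time_to_percent(times, dissolution)
lag_s = lag_min * 60  # convert minutes to seconds
print(f"ligation lags dissolution by {lag_s:.0f} s")
```

The same helper can be applied to a decatenation curve to compare its onset with ligation, provided all curves share one time axis.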

Another important event associated with termination is decatenation
of daughter molecules. To measure this process, we analysed
undigested replication products on native agarose gels (Fig. 1f, g).
Before addition of IPTG, when the array had not yet been
duplicated, replication products migrated as a compact smear of
high-molecular-weight θ structures (Fig. 1f and Fig. 1g, lane 4).
Upon addition of IPTG, most θ structures were lost within 1 min,
and they were successively converted into three types of dimeric catenanes
described previously: nicked-nicked, nicked-supercoiled,




strands; sc-sc, supercoiled-supercoiled; kb, kilobase ladder, with the size of each band (in kilobases) labelled. h, Multiple dissolution, ligation and decatenation assays were quantified. Means ± standard deviation (s.d.) are plotted (n = 4).


and supercoiled-supercoiled (Fig. 1g and Extended Data Fig. 3a).
Nicked-nicked catenanes appeared first (Fig. 1g, lanes 7, 8), followed
by nicked-supercoiled (lanes 8-10) and supercoiled-supercoiled
(Fig. 1g, lanes 9-12). Supercoiling is the result of nucleosome assembly
on closed circular DNA. Finally, monomeric, supercoiled daughter
molecules accumulated (sc, Fig. 1g, lane 17) dependent on Topo II
(Extended Data Fig. 3b-d) as seen in vivo. Topo II was not
required for dissolution or ligation (Extended Data Fig. 3c, d), suggesting
that these processes proceed independently of decatenation.
Like ligation, decatenation began ~40 s after dissolution,
but progressed at a slower rate than ligation (Fig. 1h). The same
intermediates were detected in the absence of LacR, but their order
of appearance was not well defined (Extended Data Fig. 3e).

Our results demonstrate that a reversible replication fork barrier 
allows induction of a synchronous and spatially defined termination 
event. They also show that soon after forks meet, as measured by 
dissolution, daughter molecules are quickly ligated and decatenated. 


Converging replication forks do not stall 


To test the proposal that replication forks slow down or stall during
termination, we quantified the rate of DNA synthesis as two replisomes
converged within the lacO array. To minimize the loss of
synchrony among replisomes after IPTG addition, we used a 365-bp
array containing only 12 copies of lacO, which was sufficient to prevent
dissolution at the 5 min time point (Extended Data Fig. 4c). We replicated
p[lacO×12] in the presence of LacR, added IPTG after 5 min, and
examined subsequent replication within the array by cutting the plasmid
with AflIII and PvuII (Fig. 2a). The rate of DNA synthesis within
the array was almost perfectly linear after IPTG addition (Fig. 2b, c)
even as dissolution was underway. These data suggest that converging
forks do not slow significantly before they meet. A similar conclusion
was reached when radiolabelled nucleotides were added at the same
time as IPTG and incorporation measured only during the final stage of
replication on p[lacO×12] (Extended Data Fig. 5a-f) or p[lacO×16]
(Extended Data Fig. 5g, h). Moreover, fork rates within the lacO array
resembled those previously reported in the same egg extracts
(Extended Data Fig. 5f). These results suggest that converging replisomes
do not undergo prolonged stalling.
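
Whether a synthesis time course is "almost perfectly linear" can be checked with an ordinary least-squares fit and its coefficient of determination (R²). The sketch below shows one hedged way to do this in plain Python. The helper name and the time-course values are hypothetical and are not the measurements plotted in Fig. 2c; the paper does not state that this particular statistic was used.

```python
def linear_fit(x, y):
    """Ordinary least-squares line through (x, y).

    Returns (slope, intercept, r_squared)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # R^2 = 1 - (residual sum of squares) / (total sum of squares)
    ss_res = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return slope, intercept, 1.0 - ss_res / ss_tot


# Invented array-synthesis signal (% of final) versus time (min); NOT paper data.
time_min = [5.0, 5.25, 5.5, 5.75, 6.0, 6.25]
array_signal = [0, 20, 41, 59, 80, 100]

slope, intercept, r2 = linear_fit(time_min, array_signal)
print(f"rate = {slope:.1f} % per min, R^2 = {r2:.4f}")
```

A slowdown as forks converge would show up as systematic negative residuals at the late time points and a lower R², which is why a linear fit is a reasonable first check.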




[Figure 2 graphics not reproduced. Panel a: AflIII/PvuII digest scheme showing the vector (2,356 bp) and array (667 bp) fragments released from double-Y, linear, catenane and circular-monomer intermediates after IPTG addition. Panel b: gel, buffer versus LacR, IPTG added at 5 min. Panel c: dissolution, vector synthesis and array synthesis plotted against time (min).]

Figure 2 | DNA synthesis does not stall during termination. a, Cartoon
depicting the assay for lacO array synthesis. b, LacR block-IPTG release was
performed on p[lacO×12]. To measure synthesis within the array, termination
intermediates were cut with AflIII and PvuII to liberate the array fragment from
the vector. Cleaved products were separated by native gel electrophoresis.
Different exposures of array and vector fragments are shown (see Methods).
c, Array synthesis, vector synthesis and dissolution were quantified.
Means ± s.d. are plotted (n = 3). kb, kilobase ladder, with the size of each band
(in kilobases) labelled.


To evaluate further whether forks slow or stall upon encounter with
a converging fork, we compared progression of leading strands into
arrays containing 12 or 32 copies of lacO (Fig. 3a), in which the
rightward fork should collide with a converging fork at the 6th or
16th lacO repeats, respectively (Fig. 3a). If converging forks interfere
with each other, the rightward leading strand should pause or stall
near the 6th repeat in p[lacO×12] but not in p[lacO×32]. As expected,
dissolution (Fig. 1b) happened much earlier on p[lacO×12] than on
p[lacO×32] (Extended Data Fig. 6a-d). To monitor leading-strand progression
into the array with near-nucleotide resolution, DNA intermediates
were purified, digested with the nicking enzyme Nt.BspQI,
which released rightward leading strands (Fig. 3a), and separated on a
denaturing polyacrylamide gel (Fig. 3b). Before IPTG addition, a
discrete ladder of leading strands was seen (Fig. 3b, lanes 2, 14), in
which the 3′ ends of leading strands stalled ~29-33 nt from each
LacR molecule in the array. This ~30 nt gap probably corresponds
to the footprint of the CMG complex. As shown in Fig. 3b (red
lines) and quantified in Extended Data Fig. 6e, 78% of leading strands
were stalled at the first three lacO sites, indicating that most replisomes
were blocked at the outer edges of the array.

Upon addition of IPTG, extension of leading strands resumed
immediately (Fig. 3b, lanes 3-11 and 15-23). Notably, there was no
enhanced pausing near the 6th lacO repeat of the lacO×12 array versus
the lacO×32 array. By 5.67 min, most leading strands had extended beyond
the 6th lacO repeat within both arrays (Fig. 3b, lanes 6 and 18, and
Extended Data Fig. 2c). This was also true for the leftward leading strands
(Extended Data Fig. 6f, g). Furthermore, leading strands were extended
beyond the 6th lacO repeat in the lacO×12 and lacO×32 arrays with similar
kinetics (Fig. 3c, d). When leading strands were analysed on alkaline
denaturing gels, we observed that all rightward and leftward leading
strands passed the mid-point of the array by 6.25 min (Extended Data
Fig. 2d, e), indicating that the converging leading strands were readily
extended past each other. In summary, we failed to observe detectable
slowing or pausing of DNA synthesis during termination, and converging
leading strands passed each other unhindered, implying that converging
replisomes do not pause or stall significantly.
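
The logic of the 12-copy versus 32-copy comparison rests on where the two forks are expected to meet. A minimal sketch of that expectation follows, assuming that both forks are released from opposite edges of the array at the same moment and travel at constant rates; the function and its parameters are hypothetical and serve only to make the arithmetic explicit. The array lengths (365 bp and 990 bp) are taken from the text and Fig. 3a.

```python
def predicted_collision(copies, array_bp, rightward_rate=1.0, leftward_rate=1.0):
    """Predict where two forks released from opposite array edges meet.

    Assumes simultaneous restart and constant fork rates (arbitrary units).
    Returns (repeats crossed by the rightward fork, bp it has travelled)
    at the moment of collision.
    """
    if rightward_rate <= 0 or leftward_rate <= 0:
        raise ValueError("rates must be positive")
    # Forks close the gap at the summed rate, so the rightward fork covers
    # this fraction of the array before the two meet.
    fraction = rightward_rate / (rightward_rate + leftward_rate)
    return copies * fraction, array_bp * fraction


for copies, array_bp in [(12, 365), (32, 990)]:
    repeats, distance = predicted_collision(copies, array_bp)
    print(f"{copies}x lacO: collision after {repeats:g} repeats ({distance:g} bp)")
```

With equal rates the predicted meeting points are the 6th and 16th repeats, matching the positions named in the text; unequal rates would shift the collision point toward the slower fork's starting edge.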


[Figure 3 graphics not reproduced. Panel a: position of the Nt.BspQI site and predicted fork collision points within the 365-bp (12× lacO) and 990-bp (32× lacO) arrays. Panel b: sequencing gels for the lacO×12 and lacO×32 arrays, buffer versus LacR, IPTG added at 5 min. Panels c, d: Experiment A (above) and Experiment B (not shown), lacO×12 and lacO×32 plotted against time (min).]

Figure 3 | Leading strands pass each other unhindered during termination.
a, Schematic of rightward leading strands arrested at 12× and 32× lacO arrays
(lacO×12 and lacO×32, respectively), and the predicted point of fork collision
upon IPTG addition. b, LacR block-IPTG release was performed on p[lacO×12]
and p[lacO×32]. Termination intermediates were digested with Nt.BspQI.
Nascent strands were separated alongside a sequencing ladder (generated by
primer JDO107, green arrow in a) on a denaturing polyacrylamide gel and
visualized by autoradiography. The lacO sites of p[lacO×12] are indicated in blue.
Red, yellow and grey lines indicate stall products that were quantified
(Extended Data Fig. 6e). c, Leading strands whose 3′ ends were located before
lacO7 were quantified (see Methods) along with dissolution (Extended Data
Fig. 6a-d). d, Experimental repeat of c.


Lagging strand gaps are rapidly filled in 

During SV40 replication termination, gaps of ~60 nt persist after
dissolution. To determine whether the appearance of such gaps
precedes the ligation step in our system, we mapped the 3′ ends of
the leftward leading strands and the 5′ ends of the rightward lagging
strands during termination within lacO×12 (Fig. 4a). To this end, we
digested DNA intermediates with Nb.BbvCI or Nb.BtsI to release
leading or lagging strands, respectively (Fig. 4a), and separated
them on denaturing polyacrylamide gels. After IPTG addition, we
detected a prominent leading-strand product beyond the 12th lacO
repeat (species 274 in Fig. 4b; the 3′ and 5′ termini of all leading and
lagging strand products, respectively, are mapped relative to the
Nb.BtsI site) as seen also in Fig. 3b. The 3′ end of this species was
located ~3 nt from the 5′ end of the most abundant lagging strand
product of the converging fork (271, Fig. 4c). We observed many
other, less prominent, leading-strand products (181-420, Fig. 4b),
most of which mapped close to corresponding lagging strand products
(176-417, Fig. 4c). The results show that leading strands
are generally extended to within ~3 nt of the lagging strands
(Fig. 4d). It is likely that leading strands immediately abut lagging
strands and that the ~3 nt gap reflects imprecise mapping of lagging
strands (see Methods). In conclusion, we observed no evidence of
persistent gaps between leading and lagging strands during replication
termination.


[Figure 4 graphics not reproduced. Panel a: positions of the Nb.BtsI and Nb.BbvCI nicking sites flanking the 365-bp array. Panel b: Nb.BbvCI digest (leftward leading strands), IPTG added at 5 min. Panel c: Nb.BtsI digest (rightward lagging strands), IPTG added at 5 min. Panel d: schematic of mapped leading and lagging strands.]

Figure 4 | Leading strands abut lagging strands of the opposing replisome
during termination. a, Cartoon illustrating the leading and lagging strands
released by Nb.BtsI and Nb.BbvCI nicking enzymes. Primers JDO111 (purple
arrow) and JDO110 (pink arrow) generated the sequencing ladders in b and
c, respectively. b, LacR block-IPTG release was performed on p[lacO×12].
Termination intermediates were digested with Nb.BbvCI to liberate leftward
leading strands, which were separated alongside a sequencing ladder on a
denaturing polyacrylamide gel and visualized by autoradiography. Prominent
leading strand products are highlighted (green symbols), and their sizes, in
nucleotides, measured relative to the Nb.BtsI site, are indicated. c, Same
samples as in b were digested with Nb.BtsI to liberate rightward lagging strands.
The sizes of prominent lagging strand products (orange symbols), measured
relative to the Nb.BtsI site, are indicated. d, Schematic of the mapped
leading (b) and lagging (c) strands.




CMGs dissociate late during termination 


To determine when replisome components dissociate during termination,
we monitored MCM7, CDC45, Pol ε and RPA binding to
a site flanking the lacO array using chromatin immunoprecipitation
(ChIP) (FLK2 locus, Extended Data Fig. 7a). In parallel, we monitored
dissolution, ligation and decatenation. Before IPTG addition, MCM7,
CDC45, Pol ε and RPA were 4-8-fold enriched at the array in the
presence of LacR compared to buffer (Extended Data Fig. 7b-e,
5 min time point), demonstrating that the ChIP signal reflects replisome
stalling at the array. When IPTG was added at 5 min, MCM7,
CDC45, RPA and Pol ε largely dissociated by 9 min, whereas in the
absence of IPTG, they dissociated much more slowly (Extended Data
Fig. 7b-e). RPA dissociation correlated well with ligation, as expected,
since ligation marks the disappearance of any single-stranded
(ss)DNA in the termination zone (Fig. 5a, compare red squares and
blue circles). Notably, CDC45, MCM7 and Pol ε dissociated ~1.5 min
after dissolution and ~0.5 min after RPA dissociation and ligation
(Fig. 5a). A time course of ChIP at sequences adjacent to and within
the array (Extended Data Fig. 7f-i) was consistent with MCM7,
CDC45 and DNA Pol ε moving into the array and then back out after
dissolution (Extended Data Fig. 7j). MCM7 and CDC45 also dissociated
after dissolution during replication of plasmid DNA that lacked
a lacO array (p[empty], Extended Data Fig. 8a, b). Although the delay
between ligation and unloading of MCM7 and CDC45 was not readily
detectable on this template (Extended Data Fig. 8b), this was not
surprising, given the asynchrony of termination in this setting.
Together, the data support a model in which CDC45 and MCM7
dissociate late in termination, long after forks meet (dissolution)
and shortly after ligation.

If our model is correct, inhibiting CMG unloading should not affect
dissolution or ligation. To test this, we inhibited ubiquitin signalling,
which is required for chromatin dissociation of CMG. p[empty]
was replicated in extracts that were incubated with vehicle or the
deubiquitylating enzyme inhibitor ubiquitin-vinyl-sulfone (Ub-VS),
which leads to the depletion of free ubiquitin, and we performed
MCM7 and CDC45 ChIP. As shown in Fig. 5b, c, Ub-VS substantially
delayed MCM7 and CDC45 dissociation, and this effect was partially
reversed by co-addition of free ubiquitin (Fig. 5b, c). The same inhibitory
effect of Ub-VS on CMG unloading was observed when plasmids
were recovered from egg extract and blotted for MCM7 and CDC45
(Extended Data Fig. 8c). This analysis also confirmed previous
reports of MCM7 ubiquitylation during replication (Extended
Data Fig. 8c). Importantly, dissolution, ligation and decatenation were
not affected by Ub-VS (Fig. 5d, e and Extended Data Fig. 8f-i). We
conclude that defective CMG unloading does not affect dissolution,
ligation, or decatenation, strongly supporting our model that CMG
unloading is a late event in replication termination.


Discussion 


We present a novel approach to induce synchronous and site-specific
replication termination. Using this system, we observe no slowing or
pausing of DNA synthesis as forks converge (Fig. 5f, panels i, ii).
Leading strands pass each other unhindered and immediately abut
downstream lagging strands before undergoing ligation (Fig. 5f,
panels iii-v). CMG remains associated with DNA after dissolution,
and it is unloaded only after the leading strand of one fork is ligated to
the lagging strand of the opposing fork (Fig. 5f, panel vii). Catenane
removal is initiated at the same time as ligation (Fig. 5f, panels v, vi). In
contrast to models of termination in which replication forks stall,
our data imply that topological stress between replisomes is handled
efficiently and that converging replisomes do not clash or that if they
do, any remaining template DNA is immediately reeled into the
stalled replisome for duplication (not shown). We previously showed
that CMGs encircle the leading strand template at the replication
fork. Therefore, converging CMGs approach each other on opposite
strands, which helps explain how they could pass each other. If a


[Figure 5 graphics not reproduced. Panel a: completion (%) of dissolution, ligation and decatenation, and dissociation (%) of RPA, MCM7 and other replisome components, plotted against time (min). Panels b, c: MCM7 dissociation and CDC45 dissociation against time (min). Panels d, e: dissolution and ligation against time (min); series are vehicle, Ub-VS, and Ub-VS + Ub. Panel f: model with steps (i) forks converge, (ii) dissolution without stalling, (iii) leading strand extension, (iv) CMG encircles dsDNA, (v) ligation, (vi) decatenation (Topo II), (vii) CMG unloads, (viii) decatenation finishes.]

Figure 5 | CMGs dissociate after dissolution and ligation. a, LacR block-IPTG release was followed by MCM7, CDC45, RPA and Pol ε ChIP at the indicated times after IPTG addition. Dissolution, ligation and decatenation were measured in parallel. Means ± s.d. are plotted (n = 3). b, p[empty] was replicated in extracts treated with vehicle, ubiquitin-vinyl sulfone (Ub-VS), or Ub-VS and free ubiquitin (Ub-VS + Ub). Dissociation of MCM7 was measured by ChIP (see Methods). Mean ± s.d. is plotted (n = 3). c, Same as b but CDC45 dissociation was measured. d, e, In parallel to MCM7 and CDC45 dissociation (b, c), dissolution (d) and ligation (e) were measured. Mean ± s.d. is plotted (n = 3). See Extended Data Fig. 8f-i for decatenation measurements and representative gels. f, New model of vertebrate replication termination.


fork stalls (for example, at the ribosomal DNA locus), the same
termination mechanism could still operate provided that the stalled
fork remains stable until a converging fork arrives. We expect this to
be the case, given our recent observation that a single fork stalled at a
DNA interstrand cross-link does not collapse or lose its CMG complex.
We speculate that at telomeres the replisome simply runs off
the chromosome end.

Our observations that CMG dissociates after the ligation step
(Fig. 5a), and that ligation is not affected when CMG unloading is
impaired (Fig. 5b, c, e), strongly imply that CMG is unloaded from
dsDNA. We propose that when CMG reaches the 5′ end of the
opposing fork's lagging strand, it passes over the ssDNA-dsDNA
junction and keeps moving along dsDNA (Fig. 5f), as previously
observed for purified MCM2-7 and CMG in vitro (see refs 23, 44
but see also ref. 22). This scenario is appealing, as it would prevent
CMG from interfering with ligation of the nascent strands. We propose
that CMG ubiquitylation and its removal by p97 (refs 24, 25) is
triggered once CMG encircles dsDNA. Such a mechanism would help
to avoid inappropriate CMG unloading from active replication forks,
where CMG encircles ssDNA. Our results disagree with a recent
report, which concluded that inhibition of CMG unloading prevents
completion of DNA synthesis. In contrast, another report that
defective CMG unloading does not prevent cell cycle progression
is consistent with our model. We recently reported that CMG can be
unloaded from ssDNA when two replisomes collide with a DNA
interstrand cross-link. However, this process involves a unique,
BRCA1-dependent pathway that is not employed during termination.
In conclusion, the termination mechanism described here
allows rapid completion of DNA synthesis while minimizing the possibility
of premature replisome disassembly.


Online Content Methods, along with any additional Extended Data display items 
and Source Data, are available in the online version of the paper; references unique 
to these sections appear only in the online paper. 


Received 27 September 2014; accepted 1 July 2015. 
Published online 31 August 2015. 


1. Levene, A. J., Kang, H. S. & Billheimer, F.E. DNA replication in SV40 infected cells. |. 
Analysis of replicating SV40 DNA. J. Mol. Biol. 50, 549-568 (1970). 

2. Fanning, E. & Zhao, K. SV40 DNA replication: from the A gene to a nanomachine. 
Virology 384, 352-359 (2009). 

3. Tapper, D. P. & DePamphilis, M. L. Discontinuous DNA replication: Accumulation 
of simian virus 40 DNA at specific stages in its replication. J. Mol. Biol. 120, 
401-422 (1978). 

4. Seidman, M. M. & Salzman, N. P. Late replicative intermediates are accumulated 
during simian virus 40 DNA replication in vivo and in vitro. J. Virol. 30, 600-609 
(1979). 

5. Sundin, O. & Varshavsky, A. Terminal stages of SV40 DNA replication proceed via
multiply intertwined catenated dimers. Cell 21, 103-114 (1980).

6. Ishimi, Y., Sugasawa, K., Hanaoka, F., Eki, T. & Hurwitz, J. Topoisomerase II plays an
essential role as a swivelase in the late stage of SV40 chromosome replication in
vitro. J. Biol. Chem. 267, 462-466 (1992).

7. Hiasa, H. & Marians, K. J. Two distinct modes of strand unlinking during theta-type
DNA replication. J. Biol. Chem. 271, 21529-21535 (1996).

8. Espeli, O., Levine, C., Hassing, H. & Marians, K. J. Temporal regulation of
topoisomerase IV activity in E. coli. Mol. Cell 11, 189-201 (2003).

9. Segawa, M., Sugano, S. & Yamaguchi, N. Association of simian virus 40 T antigen
with replicating nucleoprotein complexes of simian virus 40. J. Virol. 35, 320-330
(1980).

10. Tack, L. C. & DePamphilis, M. L. Analysis of simian virus 40 chromosome-T-antigen
complexes: T-antigen is preferentially associated with early replicating DNA
intermediates. J. Virol. 48, 281-295 (1983).

11. Chen, M. C., Birkenmeier, E. & Salzman, N. P. Simian virus 40 DNA
replication: characterization of gaps in the termination region. J. Virol. 17,
614-621 (1976).

12. Sundin, O. & Varshavsky, A. Arrest of segregation leads to accumulation of highly
intertwined catenated dimers: dissection of the final stages of SV40 DNA
replication. Cell 25, 659-669 (1981).

13. Ivessa, A. S., Zhou, J. Q. & Zakian, V. A. The Saccharomyces Pif1p DNA helicase and
the highly related Rrm3p have opposite effects on replication fork progression in
ribosomal DNA. Cell 100, 479-489 (2000).

14. Steinacher, R., Osman, F., Dalgaard, J. Z., Lorenz, A. & Whitby, M. C. The DNA
helicase Pfh1 promotes fork merging at replication termination sites to ensure
genome stability. Genes Dev. 26, 594-602 (2012).

15. Fachinetti, D. et al. Replication termination at eukaryotic chromosomes is
mediated by Top2 and occurs at genomic loci containing pausing elements. Mol.
Cell 39, 595-605 (2010).

16. DiNardo, S., Voelkel, K. & Sternglanz, R. DNA topoisomerase II mutant of
Saccharomyces cerevisiae: topoisomerase II is required for segregation of daughter
molecules at the termination of DNA replication. Proc. Natl Acad. Sci. USA 81,
2616-2620 (1984).

17. Baxter, J. & Diffley, J. F. Topoisomerase II inactivation prevents the completion of
DNA replication in budding yeast. Mol. Cell 30, 790-802 (2008).

18. Lucas, I., Germe, T., Chevrier-Miller, M. & Hyrien, O. Topoisomerase II can unlink
replicating DNA by precatenane removal. EMBO J. 20, 6509-6519 (2001).

19. Gaggioli, V., Le Viet, B., Germe, T. & Hyrien, O. DNA topoisomerase II controls
replication origin cluster licensing and firing time in Xenopus egg extracts. Nucleic
Acids Res. 41, 7313-7331 (2013).

20. Ilves, I., Petojevic, T., Pesavento, J. J. & Botchan, M. R. Activation of the MCM2-7
helicase by association with Cdc45 and GINS proteins. Mol. Cell 37, 247-258
(2010).

21. Pacek, M., Tutter, A. V., Kubota, Y., Takisawa, H. & Walter, J. C. Localization of
MCM2-7, Cdc45, and GINS to the site of DNA unwinding during eukaryotic DNA
replication. Mol. Cell 21, 581-587 (2006).

22. Moyer, S. E., Lewis, P. W. & Botchan, M. R. Isolation of the Cdc45/Mcm2-7/GINS
(CMG) complex, a candidate for the eukaryotic DNA replication fork helicase. Proc.
Natl Acad. Sci. USA 103, 10236-10241 (2006).

23. Kang, Y. H., Galal, W. C., Farina, A., Tappin, I. & Hurwitz, J. Properties of the human
Cdc45/Mcm2-7/GINS helicase complex and its action with DNA polymerase ε in
rolling circle DNA synthesis. Proc. Natl Acad. Sci. USA 109, 6042-6047 (2012).

24. Maric, M., Maculins, T., De Piccoli, G. & Labib, K. Cdc48 and a ubiquitin ligase drive
disassembly of the CMG helicase at the end of DNA replication. Science 346,
1253596 (2014).

25. Priego Moreno, S., Bailey, R., Campion, N., Herron, S. & Gambus, A.
Polyubiquitylation drives replisome disassembly at the termination of DNA
replication. Science 346, 477-481 (2014).

26. Hiasa, H. & Marians, K. J. Tus prevents overreplication of oriC plasmid DNA. J. Biol.
Chem. 269, 26959-26968 (1994).


17 SEPTEMBER 2015 | VOL 525 | NATURE | 349 


©2015 Macmillan Publishers Limited. All rights reserved 


27. Rudolph, C. J., Upton, A. L., Stockum, A., Nieduszynski, C. A. & Lloyd, R. G. Avoiding
chromosome pathology when replication forks collide. Nature 500, 608-611
(2013).

28. Hook, S. S., Lin, J. J. & Dutta, A. Mechanisms to control rereplication and
implications for cancer. Curr. Opin. Cell Biol. 19, 663-671 (2007).

29. Czajkowsky, D. M., Liu, J., Hamlin, J. L. & Shao, Z. DNA combing reveals intrinsic
temporal disorder in the replication of yeast chromosome VI. J. Mol. Biol. 375,
12-19 (2008).

30. McGuffee, S. R., Smith, D. J. & Whitehouse, I. Quantitative, genome-wide analysis of
eukaryotic replication initiation and termination. Mol. Cell 50, 123-135 (2013).

31. Yardimci, H., Loveland, A. B., Habuchi, S., van Oijen, A. M. & Walter, J. C. Uncoupling
of sister replisomes during eukaryotic DNA replication. Mol. Cell 40, 834-840
(2010).

32. Loveland, A. B., Habuchi, S., Walter, J. C. & van Oijen, A. M. A general approach to
break the concentration barrier in single-molecule imaging. Nature Methods 9,
987-992 (2012).

33. Santamaria, D. et al. Bi-directional replication and random termination. Nucleic
Acids Res. 28, 2099-2107 (2000).

34. Zhang, J. et al. DNA interstrand cross-link repair requires replication fork
convergence. Nature Struct. Mol. Biol. 22, 242-247 (2015).

35. Duxin, J. P., Dewar, J. M., Yardimci, H. & Walter, J. C. Replication-coupled
repair of a DNA-protein crosslink. Cell 159, 346-357 (2014).

36. Walter, J., Sun, L. & Newport, J. Regulated chromosomal DNA replication in the
absence of a nucleus. Mol. Cell 1, 519-529 (1998).

37. Sofueva, S. et al. Ultrafine anaphase bridges, broken DNA and illegitimate
recombination induced by a replication fork barrier. Nucleic Acids Res. 39,
6568-6584 (2011).

38. Charbin, A., Bouchoux, C. & Uhlmann, F. Condensin aids sister chromatid
decatenation by topoisomerase II. Nucleic Acids Res. 42, 340-348 (2014).


39. Laskey, R. A., Mills, A. D. & Morris, N. R. Assembly of SV40 chromatin in a cell-free
system from Xenopus eggs. Cell 10, 237-243 (1977).

40. Raschle, M. et al. Mechanism of replication-coupled DNA interstrand crosslink
repair. Cell 134, 969-980 (2008).

41. Long, D. T., Joukov, V., Budzowska, M. & Walter, J. C. BRCA1 promotes
unloading of the CMG helicase from a stalled DNA replication fork. Mol. Cell 56,
174-185 (2014).

42. Fu, Y. V. et al. Selective bypass of a lagging strand roadblock by the eukaryotic
replicative DNA helicase. Cell 146, 931-941 (2011).

43. Costa, A. et al. The structural basis for MCM2-7 helicase activation by GINS and
Cdc45. Nature Struct. Mol. Biol. 18, 471-477 (2011).

44. Kaplan, D. L., Davey, M. J. & O’Donnell, M. Mcm4,6,7 uses a ‘pump in ring’
mechanism to unwind DNA by steric exclusion and actively translocate along a
duplex. J. Biol. Chem. 278, 49171-49182 (2003).


Acknowledgements We thank C. Richardson and members of the Walter laboratory for
feedback on the manuscript. We thank K. J. Marians and J. T. Yeeles for plasmids and
the LacI purification protocol. J.C.W. was supported by NIH grants GM62267 and
GM80676. J.C.W. is an investigator of the Howard Hughes Medical Institute.


Author Contributions J.M.D. and J.C.W. designed the experiments. J.M.D. performed
the experiments. M.B. developed methodologies for plasmid pull downs and HIS6-Ub
immunoprecipitations. J.M.D. and J.C.W. interpreted the data and wrote the paper.


Author Information Reprints and permissions information is available at
www.nature.com/reprints. The authors declare no competing financial interests.
Readers are welcome to comment on the online version of the paper. Correspondence
and requests for materials should be addressed to J.C.W.
(johannes_walter@hms.harvard.edu).


METHODS 


No statistical methods were used to predetermine sample size. 

Protein purification. Biotinylated LacR was purified using a protocol adapted
from Kenneth Marians' laboratory (personal communication). The LacR open
reading frame was fused to a C-terminal AviTag (Avidity, Denver, CO) and
expressed from pET11a (pET11a[LacR-Avi]). To biotinylate the AviTag on
LacR-Avi, biotin ligase was co-expressed from pBirAcm (Avidity, Denver, CO).
To this end, pET11a[LacR-Avi] and pBirAcm were co-transformed into T7
Express cells (New England Biolabs) and grown in the presence of ampicillin
(100 µg ml⁻¹) and chloramphenicol (17 µg ml⁻¹). Expression of LacR-Avi and
the biotin ligase was induced by addition of IPTG to a final concentration of
1 mM. Cultures were supplemented with 50 µM biotin (Research Organics,
Cleveland, OH) to ensure efficient biotinylation of LacR-Avi.

Cell pellets were resuspended in lysis buffer (50 mM Tris-HCl, pH 7.5, 5 mM
EDTA, 100 mM NaCl, 1 mM DTT, 10% sucrose (w/v), Complete protease inhibitor
(Roche, Nutley, NJ)). The cells were lysed at room temperature in the presence
of 0.2 mg ml⁻¹ lysozyme and 0.1% Brij 58. The insoluble, chromatin-containing
fraction was isolated by centrifugation at 4 °C. Chromatin-bound
LacR was then released by sonication (in 50 mM Tris-HCl, pH 7.5, 5 mM
EDTA, 1 M NaCl, 1 mM DTT, Complete protease inhibitor, 30 mM IPTG).
DNA was removed from the soluble fraction by addition of polymin P (final
concentration 1%), and LacR was precipitated by addition of ammonium sulfate (final
concentration 37%). The precipitate was dissolved in wash buffer (50 mM Tris-HCl,
pH 7.5, 1 mM EDTA, 2.5 M NaCl, 1 mM DTT, Complete protease inhibitor)
and then applied to a column of SoftLink avidin resin (Promega, Madison, WI).
LacR was eluted (in 50 mM Tris-HCl, pH 7.5, 1 mM EDTA, 100 mM NaCl, 1 mM
DTT, 5 mM biotin) and dialysed overnight (against 50 mM Tris-HCl, pH 7.5,
1 mM EDTA, 150 mM NaCl, 1 mM DTT, 38% glycerol (v/v)). Purified LacR was
frozen in liquid nitrogen and stored at −80 °C. A more detailed purification
protocol is available on request.

Cyclin A was purified as described previously’. 

Plasmid construction and preparation. pJD82 (Extended Data Table 1) was 
created by replacing the SacI-KpnI fragment of pBlueScript II KS- with the 
sequence: GAGCTCTCACACCTACAAGGGATGTACATCAATTGTGAGCG 
GATAACAATTGTTAGGGAGGAATTGTGAGCGGATAACAATTTGGAGT 
TGATAATTGTGAGCGGATAACAATTGGCTTCAACGTAATTGTGAGCGG 
ATAACAATTTCCGTACGAATGTGCCGAACTTATGGTACC. This contains 
four tandem repeats of the lac operator sequence (AATTGTGAGCGGATAA 
CAATT) interspersed by an average of 10-11 bp of random sequence (average 
10.33 bp). Additional tandem repeats of the BsiWI-BsrGI fragment were then 
cloned into pJD82, and subsequently derived vectors, to generate arrays of 8, 12, 
16, 32 and 48 lacO repeats (Extended Data Table 1). Recognition sites for nicking 
enzymes were introduced by QuikChange mutagenesis (Agilent Technologies,
Santa Clara, CA) according to the manufacturer’s guidelines. 
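
The stated layout of the pJD82 insert can be checked directly against the sequence given above. The following is a minimal sketch, not part of the original protocol; the function name `laco_spacers` is hypothetical. It counts the lacO repeats, measures the spacers between them, and confirms that the insert is bounded by SacI (GAGCTC) and KpnI (GGTACC) sites.

```python
import re

# Insert sequence as printed in the Methods (SacI ... KpnI fragment of pJD82).
INSERT = (
    "GAGCTCTCACACCTACAAGGGATGTACATCAATTGTGAGCG"
    "GATAACAATTGTTAGGGAGGAATTGTGAGCGGATAACAATTTGGAGT"
    "TGATAATTGTGAGCGGATAACAATTGGCTTCAACGTAATTGTGAGCGG"
    "ATAACAATTTCCGTACGAATGTGCCGAACTTATGGTACC"
)
LACO = "AATTGTGAGCGGATAACAATT"  # lac operator sequence quoted in the text


def laco_spacers(seq, operator=LACO):
    """Return operator start positions and the spacer lengths between repeats."""
    starts = [m.start() for m in re.finditer(operator, seq)]
    spacers = [
        nxt - (cur + len(operator)) for cur, nxt in zip(starts, starts[1:])
    ]
    return starts, spacers


starts, spacers = laco_spacers(INSERT)
mean_spacer = sum(spacers) / len(spacers)

print(len(starts), "lacO repeats; spacers:", spacers)
print("mean spacer (bp):", round(mean_spacer, 2))
print("SacI at 5' end:", INSERT.startswith("GAGCTC"))
print("KpnI at 3' end:", INSERT.endswith("GGTACC"))
```

With the sequence as printed, the spacers come out at 10, 10 and 11 bp, which agrees with the quoted average of 10.33 bp.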

To propagate lacO plasmid DNA, plasmids were transformed into DH5α cells
and grown for a minimal number of passages in the presence of 2 mM IPTG.
DNA was prepared using the QIAprep spin kit (Qiagen, Valencia, CA). To 
eliminate preparations containing genetic rearrangements (typically ~25%), 
each preparation was separated by electrophoresis on a 0.8% agarose gel and 
visualized by ethidium bromide staining. Preparations that were free of rear- 
ranged plasmids were then verified by sequencing (Genewiz, Cambridge, MA). 
Xenopus egg extracts and DNA replication. Xenopus egg extracts were prepared
from Xenopus laevis wild-type males and females 2-5 years of age, as approved by
the Harvard Medical School Institutional Animal Care and Use Committee
(IACUC) and as described previously⁴⁶. For DNA replication, 1 volume of ‘licensing
mix’ was prepared by adding plasmid DNA to High Speed Supernatant (HSS)
of egg cytoplasm to a final concentration of 7.5-15 ng µl⁻¹. Licensing mix was
incubated for 30 min at room temperature, leading to the formation of
pre-replication complexes (pre-RCs). Next, licensing mix was supplemented with
0.1 volumes of cyclin A to a final concentration of 576 nM and incubated for a further
10 min at room temperature, as previously described’. Cyclin A treatment was
performed to achieve highly synchronous DNA replication (Extended Data
Fig. 9). Finally, 1.9 volumes of nucleoplasmic extract (NPE) was added to initiate
Cdk2-dependent replication at pre-RCs. In all figures, ‘0 minutes’ represents the
time 30 s after NPE addition. To radiolabel DNA, NPE was supplemented with
[α-³²P]dATP. Reactions were stopped with 10 volumes Stop Solution (0.5% SDS,
25 mM EDTA, 50 mM Tris-HCl pH 7.5). DNA in Stop Solution was treated with
RNase A (190 ng µl⁻¹ final concentration) then Proteinase K (909 ng µl⁻¹ final
concentration) before either direct analysis by gel electrophoresis or purification
of DNA as described previously*°. For Ub-VS experiments, Ub-VS (Boston
Biochem) was added to a final concentration of 20 µM, to HSS 5 min before addition
of plasmid DNA (HSS) and to NPE 5 min before addition of HSS, with or
without 120 µM ubiquitin (Boston Biochem). Unless otherwise stated in the
figure legend, all experiments were performed at least twice and a representative
result is shown. Replicate samples were collected from independently assembled
replication reactions, and therefore represent biological replicates.
Immunodepletions. To deplete Topo II-α from Xenopus egg extracts, one volume
of Protein A Sepharose Fast Flow (PAS) (GE Healthcare) was incubated with
4.5 volumes of affinity purified, anti-Topo II-α antibody raised against the
C-terminal 20 residues (1 mg ml⁻¹). For mock depletion, an equivalent quantity
of nonspecific IgGs was used. Five volumes of pre-cleared HSS or NPE was then
mixed with one volume of the antibody-bound sepharose and incubated for
45 min at 4 °C, and for the NPE this was repeated once. Depleted extracts were
collected and used immediately for DNA replication.

Induction of termination. To monitor termination, 0.05 volumes of plasmid
DNA (150-300 ng µl⁻¹) was incubated with 0.1 volumes LacR (54 µM) or dialysis
buffer for at least 90 min at room temperature to allow formation of LacR arrays
on the DNA. Licensing mix was prepared by adding 0.85 volumes of HSS, and
DNA was replicated as described above. To induce termination, 0.06 volumes of
IPTG was added (to a final concentration of 10 mM) at the time indicated
(typically 5 min), which triggered dissociation of lacO-bound LacR. To accurately
withdraw samples at the times indicated, reactions composed of the same
Licensing Mix and NPE were staggered, where necessary.
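
The relative volumes quoted here are consistent with the 7.5-15 ng µl⁻¹ plasmid concentration given for licensing mix in the replication protocol. The short sketch below works through that arithmetic; the helper name `diluted` is hypothetical, and the final step assumes that the 0.1 volumes of cyclin A and 1.9 volumes of NPE are both expressed relative to one volume of licensing mix.

```python
def diluted(stock_conc, part_volume, total_volume):
    """Concentration after a component of part_volume is brought to total_volume."""
    return stock_conc * part_volume / total_volume


# Relative volumes of the licensing mix, taken from the text.
LICENSING_MIX = {"plasmid": 0.05, "LacR_or_buffer": 0.10, "HSS": 0.85}
mix_volume = sum(LICENSING_MIX.values())

# Plasmid stock of 150-300 ng/ul and LacR stock of 54 uM, as stated.
dna_in_mix = [
    diluted(c, LICENSING_MIX["plasmid"], mix_volume) for c in (150.0, 300.0)
]
lacr_in_mix = diluted(54.0, LICENSING_MIX["LacR_or_buffer"], mix_volume)

# Assumption: cyclin A (0.1 vol) and NPE (1.9 vol) are added per volume of mix.
reaction_volume = mix_volume + 0.1 + 1.9
dna_in_reaction = [diluted(c, mix_volume, reaction_volume) for c in dna_in_mix]

print("DNA in licensing mix (ng/ul):", [round(c, 2) for c in dna_in_mix])
print("LacR in licensing mix (uM):", round(lacr_in_mix, 2))
print("DNA in full reaction (ng/ul):", [round(c, 2) for c in dna_in_reaction])
```

Under these assumptions the plasmid is at 7.5-15 ng µl⁻¹ in licensing mix and roughly threefold lower once NPE has been added.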

2D gel electrophoresis. 2D gels were performed as described"". Briefly, purified
DNA was digested with XmnI (New England BioLabs) and then separated by
native-native 2D gel electrophoresis. Samples were separated in the first
dimension on a 0.4% agarose gel at 0.75 V cm⁻¹ for approximately 40 h at room
temperature. The gel was stained with 0.3 µg ml⁻¹ ethidium bromide, allowing
the 2-8 kb size range to be excised. A second dimension gel containing 1% agarose
and 0.3 µg ml⁻¹ ethidium bromide was cast over the gel slice from the first
dimension. DNA was separated on the second dimension at 4.5 V cm⁻¹ for
12 h at 4 °C.

Termination assays. To monitor dissolution, 0.25-1.0 ng µl⁻¹ of purified DNA
was incubated in CutSmart Buffer with 0.4 units µl⁻¹ of XmnI (New England
BioLabs) at 37 °C for 1 h. Digested products were separated on a 1.2% agarose gel
at 4 V cm⁻¹ and detected by autoradiography. Dissolution (%) was calculated as
the percentage of total signal in each lane present in the linear products of
digestion (Lins, Fig. 1c).

To monitor ligation, 0.25-1.0 ng µl⁻¹ of purified DNA was incubated in
CutSmart buffer with 0.2 units µl⁻¹ of AlwNI (New England BioLabs) at 37 °C
for 1 h. Digests were terminated by addition of EDTA to 30 mM, then products
were separated on a 1.5% denaturing alkaline agarose gel at 1.5 V cm⁻¹ and
detected by autoradiography. The percentage of total signal in each lane present
in the full-length strands was measured (FLS, Fig. 1e). During electrophoresis,
partial hydrolysis caused signal from the FLS to smear down. To correct for this, a
fully ligated plasmid was cleaved and analysed on the same gel. The percentage of
signal in the FLS band of the fully ligated plasmid was measured and used to
correct the signal in the other lanes to yield an accurate measure of ligation. Ligation
(%) was calculated as the FLS value of each lane divided by the FLS value of the
fully ligated plasmid, multiplied by 100.

To monitor decatenation, 0.25-1.0 ng µl⁻¹ of purified DNA was separated on a
0.8% agarose gel at 4 V cm⁻¹ and detected by autoradiography. Decatenation (%)
was measured as the percentage of total signal in each lane present in circular
monomers (CMs, Fig. 1g).
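
The three read-outs above share the same arithmetic: a band's signal is expressed as a percentage of the total lane signal, and for ligation that percentage is further divided by the value obtained for the fully ligated control. A minimal sketch follows; the function names and the example signal values are hypothetical and are not taken from the paper's data.

```python
def percent_of_lane(band_signal, lane_total):
    """Percentage of total lane signal found in one band or region."""
    if lane_total <= 0:
        raise ValueError("lane_total must be positive")
    return 100.0 * band_signal / lane_total


def ligation_percent(fls_sample_pct, fls_control_pct):
    """Correct a sample's FLS percentage using the fully ligated control lane."""
    if fls_control_pct <= 0:
        raise ValueError("fls_control_pct must be positive")
    return 100.0 * fls_sample_pct / fls_control_pct


# Hypothetical autoradiography signals (arbitrary units).
sample_fls = percent_of_lane(band_signal=900.0, lane_total=2000.0)
control_fls = percent_of_lane(band_signal=1600.0, lane_total=2000.0)

print("FLS in sample lane (%):", sample_fls)
print("FLS in fully ligated control (%):", control_fls)
print("Ligation (%):", ligation_percent(sample_fls, control_fls))
```

Dissolution and decatenation use `percent_of_lane` alone, applied to the linear products and the circular monomers respectively.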

To monitor DNA synthesis within a lacO array (Fig. 2), 0.25-1.0 ng µl⁻¹
of purified DNA was incubated in buffer 3.1 with 0.2 units µl⁻¹ PvuII and
0.2 units µl⁻¹ AflIII (New England BioLabs) at 37 °C for 1 h. Digested products
were separated on a 1.2% agarose gel at 4 V cm⁻¹ and detected by autoradiography.
To measure array synthesis (SYN^ARRAY), the 0.5-1.5-kb region of each lane
was quantified (lins and DYs, Fig. 2b). To measure vector synthesis (SYN^VEC), the
2-6 kb region of each lane was quantified, which included the ~3.0 and ~6.0 kb
bands that arose when one, or both, lagging strands did not cut, respectively. Total
signal in each lane (SYN^TOT) was also measured. To correct for differences in
efficiency of DNA extraction, total lane signal was also measured in a set of
unprocessed samples (SYN^UN), which were separated and detected in parallel.
Array synthesis (%) was calculated as SYN^UN/SYN^TOT × SYN^ARRAY, vector
synthesis was calculated as SYN^UN/SYN^TOT × SYN^VEC, and in both cases the 10 min
time point was assigned a value of 100%. The same approach was also used to
quantify synthesis of the 294/794 bp fragments (quantified in the same manner as
the array) and the 2,354 bp fragments (quantified in the same manner as the
vector fragments) in Extended Data Fig. 1. In Fig. 2 and Extended Data Fig. 1, a
longer exposure of the array fragment is shown because it is less intense than the
vector fragment.
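
The extraction correction and the scaling to the 10 min time point can be sketched as follows. This is an illustration only: the function names are hypothetical and the signal values are invented, not measurements from the paper.

```python
def corrected_synthesis(syn_region, syn_tot, syn_un):
    """Region signal scaled by unprocessed/processed total lane signal."""
    return syn_un / syn_tot * syn_region


def normalise_to_reference(values_by_time, reference_time=10.0):
    """Express each time point as a percentage of the reference time point."""
    ref = values_by_time[reference_time]
    return {t: 100.0 * v / ref for t, v in values_by_time.items()}


# Hypothetical lane measurements: time (min) -> (region, total, unprocessed).
lanes = {
    5.0: (100.0, 1000.0, 1200.0),
    7.5: (300.0, 900.0, 1350.0),
    10.0: (400.0, 800.0, 1600.0),
}

corrected = {t: corrected_synthesis(*signals) for t, signals in lanes.items()}
synthesis_pct = normalise_to_reference(corrected)

for t in sorted(synthesis_pct):
    print(f"{t:>4} min: {synthesis_pct[t]:.2f}%")
```

The same two functions apply to the array region and to the vector region; only the quantified region of the lane differs.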

To analyse topoisomers (Extended Data Fig. 3d), 0.25 ng µl⁻¹ of radiolabelled
DNA was incubated in 1X buffer A and 1X buffer B (Topogen) with 0.2 U µl⁻¹
Human Topo II-α (Topogen) at 37 °C for 15 min, or in CutSmart buffer with
0.4 U µl⁻¹ XmnI or 0.04 U µl⁻¹ Nt.BspQI (New England Biolabs) for 1 h.
Nascent strand analysis. To nick rightward leading strands, 1-2 ng µl⁻¹ of purified
DNA was incubated in buffer 3.1 with 0.4 units µl⁻¹ Nt.BspQI (New England
BioLabs) at 37 °C for 1 h. To nick leftward leading strands, 1-2 ng µl⁻¹ of purified
DNA was incubated in CutSmart buffer with 0.04 units µl⁻¹ Nb.BsrDI (New
England BioLabs) at 65 °C for 1 h. To nick rightward leading strands closer to
the lacO array, 1-2 ng µl⁻¹ of purified DNA was incubated in CutSmart buffer
with 0.04 units µl⁻¹ Nb.BbvCI (New England BioLabs) at 37 °C for 1 h. To nick
leftward lagging strands, 1-2 ng µl⁻¹ of purified DNA was incubated in buffer 3.1
with 0.04 units µl⁻¹ Nb.BtsI (New England BioLabs) at 37 °C for 1 h. In all
cases, nicking reactions were stopped by the addition of 0.5 volumes of Stop
solution B (95% formamide, 20 mM EDTA, 0.05% bromophenol blue, 0.05%
xylene cyanol FF).

Nicked DNA (1.5-2 µl sample) was separated on a 42-cm-long 4% or 5%
polyacrylamide sequencing gel using Model S2 sequencing gel apparatus 
(Apogee Electrophoresis, Baltimore, MD) according to the manufacturer’s guide- 
lines. To maximize the range of nascent products that could be resolved, gels 
were cast with a thickness gradient of 0.4 to 1.2 mm, beginning to end, to establish 
an electrical field gradient during electrophoresis. Sequencing gels were 
prepared with Rapidgel-XL in 0.8X GTG buffer (USB Corporation, Cleveland). 
Sequencing ladders were generated using the Cycle Sequencing Kit (USB 
Corporation, Cleveland) with primers JDO107, JDO109, JDO110, JDO111 
(Extended Data Table 1) and pJD150 (Extended Data Table 1) as template DNA. 

Mapping and quantification of the nascent strands in Figs 3 and 4 was performed
as follows. Nascent leading and lagging strands were mapped using the
sequencing ladders generated by the primers indicated in Fig. 3a and Fig. 4a (see 
Extended Data Table 1 for sequences). Slight discrepancies may exist between 
mapped and actual lagging strand product sizes (Fig. 4c) since the sequencing 
ladder (generated by JDO110 Fig. 4a) is complementary to the lagging strands. A 
fraction of lagging strand products 176-302 were not extended upon IPTG addi- 
tion, probably because they were reached by the rightward leading strand first. 
Lagging strand products 312-417 appeared de novo after IPTG addition, and 
therefore represent growing lagging strands of the leftward fork. To quantify 
leading strand progression (Fig. 3c, d), leading strands whose 3’ ends were located 
before lacO7 (in Fig. 3b and data not shown) were quantified, and peak signal was 
assigned a value of 100 (%Max). 
ChIP and quantitative PCR. ChIP and quantitative PCR (qPCR) were performed
essentially as described’. Chromatin was withdrawn and crosslinked in
the presence of 1% formaldehyde for 10 min at room temperature. Crosslinking
was then quenched by the addition of 0.1 volumes glycine (1.25 M) for 10 min.
Samples were then spun through Bio-Spin P-6 Gel (containing Tris Buffer,
Bio-Rad) to remove salts and small molecules, before being stored in 10 volumes of
sonication buffer (20 mM Tris pH 7.5, 150 mM NaCl, 2 mM EDTA, 1% IGEPAL
CA-630 (v/v), 2 mM PMSF, 5 µg µl⁻¹ aprotinin, 5 µg µl⁻¹ leupeptin). Samples
were then sonicated to shear chromatin into approximately 250 bp fragments.

The antibodies used were described previously”. Antibodies were incubated
with chromatin overnight at 4 °C, then immunoprecipitated by addition of
Protein A-Sepharose Fast Flow beads (GE Healthcare) for 2 h at room temperature.
Beads were washed sequentially with sonication buffer, high salt buffer
(sonication buffer supplemented with 500 mM NaCl and 100 mM KCl), wash
buffer (10 mM Tris pH 7.5, 0.25 M LiCl, 1 mM EDTA, 0.5% NP-40 (v/v), 0.5%
SDS (w/v)) and TE (10 mM Tris pH 7.5, 1 mM EDTA), before being eluted into
elution buffer (50 mM Tris pH 7.5, 10 mM EDTA, 1% SDS) at 65 °C for 20 min.
Eluted chromatin, and input samples, were treated with RNase for 30 min at
37 °C. Finally, proteins were degraded by addition of NaCl (250 mM final) and
treatment with Pronase (2 µg µl⁻¹ final) at 42 °C for 6 h. DNA-peptide crosslinks
were reversed by treatment at 70 °C for a further 9 h. DNA was subsequently
phenol:chloroform extracted and ethanol precipitated. The absolute amount of
DNA recovered from the immunoprecipitated and input samples was measured
by quantitative PCR (qPCR) relative to a standard curve. The qPCR primers used
are listed in Extended Data Table 1. Binding was measured as the percentage
recovery of immunoprecipitated DNA, relative to the input (EXP^REC).

To minimize error in the ChIP process, an internal control was built into all
experiments. Xenopus egg extracts were used to separately replicate a different
plasmid, pQUANT (see Extended Data Table 1 for sequences). Mid-way through
replication, pQUANT was crosslinked, quenched and spun through Bio-Spin
P-6 gel (as above) to yield a single pool of heterologous chromatin that was
bound by replication proteins. An equal amount of pQUANT chromatin
was added to all experimental chromatin samples before sonication, and this
was carried through the entire ChIP procedure. For each set of
immunoprecipitations, the recovery of pQUANT (QNT^REC) should be identical between
samples. To correct for technical variation in any set of immunoprecipitations,
average pQUANT recovery was calculated (QNT^AVG) and normalized recovery
(%) was calculated as EXP^REC × QNT^AVG/QNT^REC. This ensured that the only
sources of technical variation were the crosslinking process and the qPCR. To
maximize the reliability of the qPCR, these measurements were performed
in quintuplicate and the median value was used. Where three ChIP experiments
were combined and plotted as mean ± s.d. (Fig. 5a-c and Extended
Data Figs 7f-i and 8j) it was necessary to normalize the data to correct for
differences in absolute IP efficiency between experiments. For each protein
measured by ChIP, mean recovery across all loci in all samples (mean^EXP) was
calculated for each experiment (mean^EXP1, mean^EXP2 and mean^EXP3) and used to
generate a correction factor for each experiment (for example, for experiment
1 the correction factor is [(mean^EXP1 + mean^EXP2 + mean^EXP3)/3]/mean^EXP1). To
measure dissociation (Fig. 5a-c), recovery of the FLK2 locus was measured (shown in
Extended Data Figs 7f-i and 8j) and peak signal was assigned a value of ‘0’, while
background signal (measured at 4 or 5 min for Fig. 5a, or 10 min for Fig. 5b, c)
was assigned a value of 100. The experiments shown in Fig. 5a and Extended
Data Fig. 7f-i were repeated three times, once with p[lacOx12] and twice with
p[lacOx16].
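
The normalization steps in this section can be expressed compactly. The sketch below is an illustration under stated assumptions rather than the authors' analysis code: function names are hypothetical, the numbers are invented, and the dissociation scale is assumed to be a linear rescaling between the peak signal (0) and the background signal (100).

```python
from statistics import mean, median


def normalise_to_spike_in(exp_rec, qnt_rec):
    """Scale each sample's recovery by average/individual pQUANT recovery."""
    qnt_avg = mean(qnt_rec)
    return [e * qnt_avg / q for e, q in zip(exp_rec, qnt_rec)]


def experiment_correction_factors(experiment_means):
    """Correction factor per experiment: grand mean divided by its own mean."""
    grand_mean = mean(experiment_means)
    return [grand_mean / m for m in experiment_means]


def dissociation_percent(recovery, peak, background):
    """Assumed linear scale: peak signal maps to 0, background signal to 100."""
    return 100.0 * (peak - recovery) / (peak - background)


# Hypothetical quintuplicate qPCR measurement; the median resists the outlier.
print("median of quintuplicate:", median([1.0, 1.2, 0.9, 5.0, 1.1]))

# Hypothetical percentage recoveries for two samples in one set of IPs.
print("normalised recovery:", normalise_to_spike_in([2.0, 4.0], [1.0, 3.0]))

# Hypothetical mean recoveries from three independent experiments.
print("correction factors:", experiment_correction_factors([1.0, 2.0, 3.0]))

print("dissociation (%):", dissociation_percent(6.0, peak=10.0, background=2.0))
```

Dividing by each sample's own pQUANT recovery removes variation in immunoprecipitation efficiency within a set, while the per-experiment correction factor brings independent experiments onto a common scale before averaging.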

Plasmid pull downs. Plasmid pull downs were performed essentially as
described”, with the following exceptions. Beads were resuspended in buffer
supplemented with 4% DMSO and 100 µM NMS-873⁴⁸ to block further CMG
unloading once the samples were withdrawn”. Plasmid-associated proteins from
40-80 ng of plasmid were isolated, and a quarter of the sample was analysed by
western blotting using previously described antibodies against CDC45, MCM7
and PCNA“.

HIS6-Ub immunoprecipitations. Ni-NTA Superflow Resin (Qiagen) was
washed three times with Urea buffer (10 mM imidazole, 0.2% NP-40, 8 M urea,
500 mM NaH2PO4, 50 mM Tris HCl, pH 8.0). For each immunoprecipitation,
10 µl of resin was added per tube, and resuspended to 191 µl in Urea buffer.
Extracts were supplemented with 100 µM of HIS6-ubiquitin (Boston Biochem)
and replication was carried out as described above. At the indicated time, 9 µl of
extract was mixed with the bead mix and samples were incubated for 1 h at room
temperature, with end-over-end rotation. Resin was washed three times with urea
buffer. All residual buffer was removed, and resin was boiled for 5 min in 30 µl
sample buffer (125 mM Tris-HCl pH 6.8, 20% glycerol, 6.1% SDS, 0.01%
bromophenol blue, 10% β-mercaptoethanol). 30 µl of 0.5 M imidazole was added to each
sample and HIS6-tagged proteins were eluted off the resin for 60 min at room
temperature, with gentle agitation. Resin was spun down at 1,000 RCF for 1 min,
and the supernatant was removed. 10 µl of each sample was resolved on an
SDS-PAGE gel alongside an input control and analysed by western blotting using the
previously described antibody against MCM7”. In Extended Data Fig. 8d, a
longer exposure of the IP lanes is shown, since they are far less intense than the
input lanes.


45. Prokhorova, T. A., Mowrer, K., Gilbert, C. H. & Walter, J. C. DNA replication of mitotic
chromatin in Xenopus egg extracts. Proc. Natl Acad. Sci. USA 100, 13241-13246
(2003).

46. Lebofsky, R., Takahashi, T. & Walter, J. C. DNA replication in nucleus-free Xenopus
egg extracts. Methods Mol. Biol. 521, 229-252 (2009).

47. Budzowska, M., Graham, T. G. W., Sobeck, A., Waga, S. & Walter, J. C. Regulation of
the Rev1-polζ complex during bypass of a DNA interstrand cross-link. EMBO J. 34,
1971-1985 (2015).

48. Magnaghi, P. et al. Covalent and allosteric inhibitors of the ATPase VCP/p97 induce
cancer cell death. Nature Chem. Biol. 9, 548-556 (2013).

49. Walter, J. C. & Newport, J. Initiation of eukaryotic DNA replication: origin unwinding
and sequential chromatin association of Cdc45, RPA, and DNA polymerase α. Mol.
Cell 5, 617-627 (2000).

50. Long, D. T., Raschle, M., Joukov, V. & Walter, J. C. Mechanism of RAD51-dependent
DNA interstrand cross-link repair. Science 333, 84-87 (2011).




[Extended Data Figure 1, image not reproduced. Recoverable labels: a, 2D gel schematic
(first dimension: size) showing the XmnI 1X spot and the double-Y and bubble arcs;
b, c, maps of p[empty] and p[lacOx16] with PvuII sites and the 2,354-bp backbone
fragment; e, f, plots against time (min) for buffer and LacR samples (294-bp, 794-bp
and 2,354-bp fragments).]


Extended Data Figure 1 | Sequence-specific termination can be induced at a
LacR array. a, To investigate whether a LacR array blocks replication forks, a
plasmid containing a tandem array of 16 lac operator (lacO) sequences,
p[lacO16] (or p[lacOx16]), was incubated with buffer or LacR and then
replicated in egg extract containing [α-³²P]dATP. Radiolabelled replication
intermediates were cleaved with XmnI (far left cartoon) and separated
according to size and shape by 2D gel electrophoresis (see schematic of 2D gel).
As replication neared completion at 4.5 min, mainly linear molecules were
produced in the presence of buffer (orange arrowhead). In contrast, in
the presence of LacR, a discrete spot appeared on the double-Y arc (blue
arrowhead), demonstrating that converging replication forks accumulate at a
specific locus on p[lacO16]. These data indicate that 16 copies of LacR block
replication forks. b-f, To test whether the double-Y structures observed in panel
a arose from replication forks stalling at the outer edges of the lacO array, we
tested whether LacR specifically inhibited replication of lacO sequences. To this
end, p[lacO16] (c) and the parental plasmid lacking lacO repeats, p[empty] (b),




were incubated in the presence of buffer or LacR and replicated using Xenopus
egg extracts containing [α-³²P]dATP. Radiolabelled replication intermediates
were cleaved with AflIII and PvuII to release the 2,354-bp plasmid backbone
(b and c) and a 294-bp control fragment from p[empty] (b) or a 794-bp lacO
fragment from p[lacO16] (c). The plasmid backbone and the respective inserts
were separated on a native gel and detected by autoradiography (d). A
longer exposure of the small fragments is shown, since they are less intense
than the large fragments. The results in panel d were quantified in e and
f. Notably, LacR specifically inhibited replication of the lacO-containing
fragment in p[lacO16] (f, blue circles) but not the control fragment in
p[empty] (e, green circles). We conclude that LacR prevents replication of
the lacO array and that the double-Ys in panel a represent forks converged
on the outer edges of the array. Importantly, synthesis within the 2,354-bp
backbone fragment (f, orange circles) of p[lacO16] was not inhibited in
the presence of LacR, indicating that no global structural changes occur that
inhibit replication.


[Extended Data Figure 2, image not reproduced. Recoverable labels: a, AlwNI digest with
IPTG added at 5 min, showing FLS, RWS and LWS; b, map of the 490-bp lacO array with
Nt.BspQI and Nb.BsrDI nicking sites; c, Nt.BspQI and Nb.BsrDI digests of buffer and LacR
samples with IPTG added at 5 min, flanked by KB ladders; d, e, plots for rightward
strands (Nt.BspQI digest) and leftward strands (Nb.BsrDI digest) against time (min),
5.0-6.25 min.]


Extended Data Figure 2 | Supplementary fork progression data. a, The gel 
shown in Fig. 1e was overexposed and shown in its entirety so that the smaller 
leftward strands (LWS, Fig. 1d) could be detected. As observed for the 
rightward strands (RWS, Fig. 1e), LWS rapidly increased in size and then 
disappeared as they were ligated to produce full-length strands (FLS, Fig. 1e). 
b-e, To determine whether the heterogeneity of LWS (a) and RWS (Fig. 1e) was 
due to delayed extension of lagging strands, or because a significant fraction 
of leading strands did not restart upon IPTG addition, we specifically 
monitored leading strand progression upon IPTG addition on p[lacOx16]. To 
this end, DNA samples were treated with Nt.BspQI or Nb.BsrDI to specifically 
liberate the rightward or leftward leading strands, respectively (b), and 
DNAs were separated on a denaturing agarose gel (c). Before IPTG addition, 
discrete leading strand products of the expected size were observed (lanes 2 and 
10). The presence of two stall products reflects the fact that at a slow rate, 
the replisome bypasses LacR (see also Fig. 3). Upon IPTG addition, these 
species rapidly and completely shifted up the gel, indicating that rightward and 
leftward leading strands restarted efficiently. Therefore, the heterogeneity of 
the LWS (a) and RWS (Fig. 1e) is probably due to delayed ligation of a new 
Okazaki fragment to the lagging strands. Quantification of leading strands that 
had not reached the midpoint of the array (rightward and leftward strands 
smaller than 550 and 500 nt, respectively, b) revealed that by 6.25 min, 90% of 
rightward and leftward leading strands passed the midpoint of the array 
(d, e). This demonstrates that leading strands pass each other when forks meet. 
KB, kilobase ladder, with the length of each band (in kilobases) labelled. 




[Extended Data Figure 3 image. Panels: a, autoradiograph with LacR and IPTG (at 5 min), intermediates n-n, n-sc, sc-sc, n and sc marked; b, western blot for Topo II and MCM7; c, mock- and Topo-II-depleted reactions, IPTG (at 7 min); d, mock- and Topo-II-depleted products after enzyme treatment; e, pBS replication products. KB, kilobase ladder.]


Extended Data Figure 3 | Topo-II-dependent decatenation of p[lacOx16]. 
a, The autoradiograph in primary Fig. 1g is reproduced with cartoons 
indicating the structures of the replication and termination intermediates n-n, 
n-sc, sc-sc, n and sc (see Fig. 1 for definitions). The order of appearance of the 
different catenanes matches previous work (n-n, then n-sc, then sc-sc). 
b-d, To determine the role of Topo II during termination within a lacO array, 
termination was monitored in mock- or Topo-II-depleted extracts. To confirm 
immunodepletion of Topo II, mock and Topo-II-depleted NPE was blotted 
with MCM7 and Topo II antibodies (b). p[lacOx16] was incubated with LacR, 
then replicated in either mock- or Topo-II-depleted egg extracts in the presence 
of [α-32P]dATP, and termination was induced with IPTG (at 7 min). Untreated 
DNA intermediates were separated by native gel electrophoresis (c). In the 
mock-depleted extract, nicked and supercoiled monomers were readily 
produced (as in panel a, albeit with slower kinetics due to nonspecific inhibition 
of the extracts by the immunodepletion procedure), while in the 
Topo-II-depleted extracts, a discrete species was produced. DNA from the last time point 
in each reaction (lanes 4 and 8 in panel c) was purified and treated with XmnI, 
which cuts p[lacOx16] once, or Nt.BspQI, which nicks p[lacOx16] once, or 
recombinant Topo II, and then separated by native gel electrophoresis (d). 
Cleavage of the mock- and Topo-II-depleted products with XmnI yielded 
the expected linear 3.15-kb band (lanes 2 and 6), demonstrating that 
in both extracts all products were fully dissolved topoisomers of each other. 
Relaxation of the mock-depleted products by nicking with Nt.BspQI yielded a 
discrete band corresponding to nicked plasmid (lane 3), while the 
Topo-II-depleted products were converted to a ladder of discrete topoisomers (lane 7), 
which we infer represent catenated dimers of different linking numbers, 
since the mobility difference cannot be due to differences in supercoiling. 
Importantly, the mobility shift after Nt.BspQI treatment (lane 5 versus lane 7) 
demonstrated that the Topo-II-depleted products (lane 5) were covalently 
closed and thus in the absence of Topo II, ligation of the daughter strands still 
occurred. Treatment of the mock- and Topo-II-depleted products with 
recombinant human Topo II produced the same relaxed monomeric species 
(lanes 4 and 8), further confirming that the Topo-II-depleted products 
contained catenanes. Collectively, these observations demonstrate that 
termination within a lacO array in Topo-II-depleted extracts produces highly 
catenated supercoiled-supercoiled dimers, as seen in cells lacking Topo II. 
These data confirm that Topo II is responsible for decatenation and argue that 
termination within a lacO array reflects physiological termination. e, n-n, 
n-sc, sc-sc, n and sc products were also detected when plasmid lacking 
lacO sequences (pBlueScript) was replicated in the absence of LacR without 
the use of cyclin A to synchronize replication. Therefore, these intermediates 
arise in the course of unperturbed DNA replication in Xenopus egg extracts. 




[Extended Data Figure 4 image. Panels: a, cartoon of the dissolution assay (XmnI digest; double Ys (DYs) and linears (Lins), +IPTG); b, gels for the 0x, 12x, 16x, 32x and 48x lacO plasmids with LacR, showing DYs and Lins; c, plot for 0, 12, 16, 32 and 48 x lacO against Log[Time (min)], 5-160.]


Extended Data Figure 4 | Inhibition of termination by different-sized LacR 
arrays. a, Cartoon depicting intermediates detected in the dissolution assay. 
b, To determine the ability of different-sized LacR arrays to inhibit termination, 
the earliest stage of termination, dissolution (a), was monitored in plasmids 
containing 0, 12, 16, 32, or 48 lacO repeats. Plasmids were incubated with LacR, 
and replicated in the presence of [α-32P]dATP. To measure dissolution, 
radiolabelled termination intermediates were cut with XmnI. Cleaved products 
were separated on a native agarose gel and detected by autoradiography. 
c, Quantification of dissolution in b. When 12 or more lacO repeats were 
present in the array, dissolution was robustly inhibited for at least 5 min. Potent 
inhibition lasted 10 min when 32 lacO sequences were present, and 20 min 
in the presence of 48 lacO sequences. In the absence of lacO sequences, 
dissolution was essentially complete by 5 min. Therefore, 12 lacO repeats are 
sufficient to inhibit termination for 5 min. 




[Extended Data Figure 5 image. Panels: a, scheme of IPTG and [α-32P]dATP addition, with catenanes (Cats) and circular monomers (CMs); b, gel for p[lacOx12], IPTG + [α-32P]dATP (at 5 min), with LacR; c-e, Experiments A-C, dissolution (%) and synthesis (% peak) against time (min), 5-6.5; f, mean ± s.d. of Experiments A, B and C; g, gel for p[lacOx16], IPTG + [α-32P]dATP (at 5 min); h, dissolution (%) and synthesis (% max) against time (min).]


Extended Data Figure 5 | The rate of total DNA synthesis does not slow 
before dissolution. a-c, To test further whether replication stalls or slows 
before dissolution, p[lacOx12] was pre-incubated with LacR and replicated in 
Xenopus egg extracts. Termination was then induced by addition of IPTG after 
5 min. Simultaneously, [α-32P]dATP was added to specifically radiolabel 
DNA synthesized after IPTG addition (a). Radiolabelled DNA was then 
separated on a native agarose gel and total signal was measured by 
autoradiography (b). Total signal was quantified, normalized to peak signal, 
and graphed alongside the rate of dissolution, which was also measured in the 
same experiment (c). This approach gives a highly sensitive measure of 
DNA synthesis without manipulation of DNA samples. DNA synthesis should 
occur primarily within the lacO array (see Extended Data Fig. 1). Upon 
IPTG addition, there was an approximately linear increase in signal, which 
plateaued by 5.83 min. Importantly, dissolution was 65% complete by 5.83 min. 
Therefore, the large majority of dissolution occurs without stalling of DNA 
synthesis. d, e, Experimental repeats of b, c. f, The experiments shown in 
c-e were graphed together with mean ± s.d. Synthesis data were normalized so 
that for each experiment, synthesis at 1 min was assigned a value of 84.4%, 
since this was the average value from c, d, where synthesis was allowed to 
plateau. Given the rate of replication fork progression in these egg 
extracts (260 bp min⁻¹ (ref. 32)) and the size of the array (365 bp), forks 
should require, on average, 0.7 min to converge if no stalling occurs 
((365 bp/2)/260 bp min⁻¹ = 0.7 min). The time required for dissolution was 
not appreciably longer than this (dissolution was 50% complete by 0.67 min 
after IPTG addition, f), consistent with a lack of stalling. g, h, The experiment 
shown in b, c was repeated using p[lacOx16]. Synthesis was approximately 
linear until 6.17 min, at which point 81% of molecules had dissolved, further 
demonstrating that the majority of dissolution occurs without stalling of 
DNA synthesis. 
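
The convergence-time estimate quoted for panel f can be reproduced directly. The sketch below only illustrates that arithmetic, assuming that the two forks start at the outer edges of the array and each travels at the quoted rate; the function and variable names are hypothetical and are not part of the original analysis.

```python
def expected_convergence_time(array_bp: float, fork_rate_bp_per_min: float) -> float:
    """Time (min) for two forks, one at each edge of the array, to meet mid-array."""
    if array_bp <= 0 or fork_rate_bp_per_min <= 0:
        raise ValueError("array size and fork rate must be positive")
    # Each fork only has to cover half of the array before the two meet.
    return (array_bp / 2) / fork_rate_bp_per_min


# Values quoted in the legend: 365 bp array, fork rate of 260 bp per min.
predicted = expected_convergence_time(365, 260)
observed_half_dissolution = 0.67  # min after IPTG addition (panel f)

print(f"predicted convergence time: {predicted:.2f} min")
print(f"observed 50% dissolution:   {observed_half_dissolution:.2f} min")
```

Under these assumptions the predicted and observed times can be compared side by side, which is the comparison the legend uses to argue against stalling.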




[Extended Data Figure 6 image. Panels: a, gel for p[lacOx12] and c, gel for p[lacOx32], with Buffer, LacR and IPTG, showing DYs and Lins; b, d, plots for Buffer, LacR and IPTG against time (min) for p[lacOx12] and p[lacOx32]; e, bar chart by stall product, 1-14; f, maps of p[lacOx12] (365 bp array) and p[lacOx32] (990 bp array); g, gel for 12xlacO and 32xlacO with LacR and IPTG, alongside a sequencing ladder (GATC).]


Extended Data Figure 6 | Replisome progression through 12 and 32 lacO 
arrays. a-d, To test whether replisomes meet later in a lacOx32 array than a 
lacOx12 array, we monitored dissolution. LacR block-IPTG release was 
performed on p[lacOx12] and p[lacOx32] and radiolabelled termination 
intermediates were digested with XmnI to monitor the conversion of double-Y 
molecules to linear molecules (dissolution). Cleaved molecules were separated 
on a native agarose gel, detected by autoradiography (a, c), and quantified 
(b, d). Upon IPTG addition, dissolution was delayed by at least 1 min within the 
32 lacO array compared to the 12 lacO array (b, d). Moreover, by 6 min, 92% 
of forks had undergone dissolution on p[lacOx12] while only 9% had 
dissolved on p[lacOx32] (b, d). e, Stall products within the 12 lacO array (Fig. 3b, 
lane 2) were quantified, signal was corrected based on size differences of the 
products, and the percentage of stall products at each stall point was calculated. 
78% of leading strands stalled at the first three arrest points (red columns), 
19% stalled at the fourth to tenth arrest points (yellow columns) and the 
remaining 3% stalled at the tenth to fourteenth arrest points (grey columns). 
The appearance of fourteen arrest points is reproducible but surprising, given 
that the presence of only 12 lacO sequences was confirmed by sequencing in the 
very preparation of p[lacOx12] that was used in Fig. 3. The thirteenth and 
fourteenth arrest points cannot stem from cryptic lacO sites beyond the 
twelfth lacO site, as this would position the first leftward leading strand stall 
product ~90 nucleotides from the lacO array, instead of the observed ~30 
nucleotides (see f, g). At present, we do not understand the origin of these stall 
products. f, g, Progression of leftward leading strands into the array. The 
same DNA samples used in Fig. 3 were digested with the nicking enzyme 
Nb.BsrDI, which released leftward leading strands (f), and separated on a 
denaturing polyacrylamide gel (g). The lacO sites of p[lacOx12] are highlighted 
in blue on the sequencing ladder (g), which was generated using the primer 
JDO109 (green arrow, f). Green circles indicate two nonspecific products of 
digestion. These products arise because nicking enzyme activity varies between 
experiments, even under the same conditions. There was no significant 
difference in the pattern of leftward leading strand progression between the 12 
lacO and 32 lacO arrays, as seen for the rightward leading strands (Fig. 3b). 
Specifically, by 5.67 min, the majority of leading strands had extended beyond 
the seventh lacO repeat within lacOx12 (lane 6) and the equivalent region of 
lacOx32 (lane 18). Therefore, progression of leftward leading strands is 
unaffected by the presence of an opposing replisome, suggesting that 
converging replisomes do not stall when they meet. 
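
The size correction described for panel e can be illustrated as follows. This is a minimal sketch which assumes that the correction divides each band's signal by the product length, on the reasoning that a longer radiolabelled strand carries proportionally more label; the legend does not state the exact procedure, and the function name and the example values are hypothetical.

```python
def stall_percentages(signals, lengths_nt):
    """Convert raw band signals to the percentage of molecules at each stall point.

    Assumption: radiolabel is incorporated uniformly, so signal / length is
    proportional to the number of molecules in each band.
    """
    if len(signals) != len(lengths_nt):
        raise ValueError("need one length per signal")
    if any(n <= 0 for n in lengths_nt):
        raise ValueError("product lengths must be positive")
    molar = [s / n for s, n in zip(signals, lengths_nt)]
    total = sum(molar)
    if total == 0:
        raise ValueError("total corrected signal is zero")
    return [100 * m / total for m in molar]


# Hypothetical example: three bands whose raw signal scales with their length,
# so that after correction each band holds the same share of molecules.
example = stall_percentages(signals=[300, 330, 360], lengths_nt=[100, 110, 120])
print([round(p, 1) for p in example])
```

Without the correction, the longest product in this example would appear over-represented even though all three bands contain the same number of molecules.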




[Extended Data Figure 7 image. Panels: a, map of the ChIP loci: LAC (multiple amplicons), FLK2 (-82 to -173 from LAC) and FAR (-1286 to -1376 from LAC); b-i, ChIP plots for Pol ε, MCM7, CDC45 and RPA against time (min), with Buffer, LacR and IPTG series or LAC, FLK2 and FAR series; j, predicted CMG binding at LAC and FLK2; k, 2D gels (1st dimension: size) for LacR at 5.0 min, IPTG (at 5 min) at 5.5, 6.0, 6.5, 7.0, 7.5 and 8.0 min, and LacR at 8.0 min.]


Extended Data Figure 7 | Supplementary ChIP data. a, Cartoon depicting 
the LAC, FLK2 and FAR loci, which were used for ChIP. Their precise locations 
relative to the leftward edge of the lacO array are indicated. The LAC amplicon 
is present in four copies distributed across the lacOx16 array and three copies 
distributed across the lacOx12 array. b-e, p[lacOx12] was incubated with buffer 
or LacR and termination was induced at 5 min by IPTG addition. MCM7, RPA, 
CDC45 and Pol ε ChIP was performed at different time points after IPTG 
addition but also in the buffer control and no IPTG control. Recovery of FLK2 
was measured as a percentage of input DNA. Upon IPTG addition, ChIP signal 
declined and by 9 min was comparable to the buffer control, demonstrating 
that unloading of replisomes was induced within 4 min of IPTG addition. f, To 
test whether movement of the replisome into and out of the lacO array could 
be detected upon IPTG addition, termination was monitored within a lacO 
array, and we performed ChIP of the leading strand polymerase Pol ε, which 
was inferred to move into and out of the array based on the behaviour of leading 
strands during termination (Extended Data Fig. 2b-e). It was predicted that 
Pol ε ChIP at the LAC locus should increase slightly as Pol ε enters the lacO array 
and decline again as converging polymerases pass each other, but persist at 
FLK2 while the polymerases move out of the array. Before IPTG addition, Pol ε 
was enriched at LAC and FLK2 compared to FAR, consistent with the leading 
strands being positioned on either side of the lacO array (Extended Data Fig. 2c 
and Fig. 3). Upon IPTG addition, Pol ε became modestly enriched at LAC 
compared to FLK2 (5.5 min) but then declined to similar levels at both LAC and 
FLK2 by 6.5 min. These data are consistent with the leading strand polymerases 
entering the lacO array and passing each other. g, h, To test whether CMG 
exhibited the same ChIP profile as Pol ε, MCM7 and CDC45 ChIP was 
performed using the same samples. After IPTG addition, MCM7 and CDC45 
were enriched at LAC compared to FLK2 (5.5 min), then declined to similar 
levels at both LAC and FLK2 by 6.5 min, as seen for Pol ε (f). These data are 
consistent with a model in which CMGs enter the array and pass each other 
during termination. A caveat of these experiments is the relatively high recovery 
of the FAR locus in MCM7, CDC45 and Pol ε ChIP. Specifically, signal was 
at most only ~2-fold enriched at LAC compared to FAR. This was not due to 
high background binding, because by the end of the experiment (10 min 
time point, not shown) we observed a decrease in signal of ~5-7-fold. 
Furthermore, we observed ~5-7-fold enrichment in binding (ChIP) of 
replisome components to p[lacOx12] that had been incubated in LacR compared 
to a buffer control (see g-i, below). Instead, the high FAR signal was 
probably due to poor spatial resolution of the ChIP. Consistent with this, when 
a plasmid containing a DNA interstrand cross-link (ICL) was replicated, 
essentially all replisomes converged upon the ICL but the ChIP signal for 
MCM7 and CDC45 was only ~3-4-fold enriched at the ICL compared to a 
control locus. We speculate that the higher background observed at the 
control locus in our experiments is due to the decreased distance of the control 
locus from the experimental locus (1.3 kb for p[lacOx16] and p[lacOx12] versus 
2.4 kb for the ICL plasmid) and possibly due to increased catenation of the 
parental strands during termination. The high signal at FAR should not 
complicate interpretation of the MCM7, CDC45 and Pol ε ChIP (f), as signal at 
FAR was essentially unaltered between 5 and 6.5 min. Further evidence that the 
high signal seen at the FAR locus emanates from forks stalled near the lacO 
array is presented in panel k. i, ChIP of RPA was performed on the same 
chromatin samples used in b-d. As seen for Pol ε, MCM7 and CDC45, 
enrichment of RPA at LAC compared to FAR was relatively low, consistent with 
poor spatial resolution. j, Predicted binding of CMGs to the LAC, FLK2 and 
FAR loci before and after IPTG addition if converging CMG pass each other. 
k, To determine whether most forks stalled at the array and not elsewhere in 
the plasmid, we performed a time course in which p[lacOx16] undergoing 
termination was examined by 2D gel electrophoresis at various time points. 
p[lacOx16] was pre-bound to LacR and replicated in Xenopus egg extract 
containing [α-32P]dATP. Termination was induced by IPTG addition and 
samples were withdrawn at different times. Radiolabelled replication 
intermediates were cleaved with XmnI (as in Extended Data Fig. 1a) and 
separated according to size and shape on 2D gels. A parallel reaction was 
performed in which samples were analysed by ChIP, which was one of the 
repeats analysed in b-e. In the presence of LacR, a subset of double-Y molecules 
accumulated (blue arrowhead), demonstrating that 83% of replication 
intermediates (signal in dashed blue circle) contained two forks converged at a 
specific locus. After IPTG addition, linear molecules rapidly accumulated 
(orange arrowhead) as dissolution occurred. Importantly, the vast majority of 
signal was present in the discrete double-Y and linear species (blue and orange 
arrowheads), demonstrating that the relatively high ChIP signal observed at 
FAR in panels f-i was derived from forks present at the lacOx16 array and not 
elsewhere. 




[Extended Data Figure 8 image. Panels: a, map of p[empty] (2648 bp); b, dissolution, ligation, MCM7 dissociation and CDC45 dissociation against time (min); c, plasmid pull-down blots for CDC45 and MCM7 (poly-Ub), with or without Ub-VS, at 4, 8 and 12 min; d, His6-Ub pull-down (input and IP), with or without DNA, blotted for MCM7; e, gels showing Lins, FLS and CMs, with or without Ub-VS, at 4, 8 and 12 min; f, plot against time (min) with Ub-VS and Ub-VS + Ub series; g-i, gels and plots of dissolution, ligation and decatenation for vehicle, Ub-VS and Ub-VS + Ub against time (min); j, MCM7 ChIP and CDC45 ChIP, FLK2 recovery (%) against time (min).]


Extended Data Figure 8 | Supplementary termination data for p[empty] 
experiments. a, Cartoon depicting the XmnI and AlwNI sites on p[empty], 
which are used for the dissolution and ligation assays, respectively, and the 
FLK2 locus, which is used for ChIP. b, Plasmid DNA without a lacO array 
(p[empty]) was replicated and at different times chromatin was subjected to 
MCM7 and CDC45 ChIP. Per cent recovery of FLK2 was quantified and used to 
measure dissociation of MCM7 and CDC45 (see Methods). Dissolution and 
ligation were also quantified in parallel. Mean ± s.d. is plotted (n = 3). The 
MCM7 and CDC45 dissociation data are obtained from the vehicle controls in 
Fig. 5b, c, while the dissolution and ligation data are obtained from the vehicle 
controls in Fig. 5d, e. c, To seek independent evidence for the conclusions 
of the ChIP data presented in Fig. 5b, c, we used a plasmid pull-down 
procedure. p[empty] was replicated in egg extracts treated with vehicle or 
Ub-VS. At the indicated times, chromatin-associated proteins were captured on 
LacR-coated beads (which bind DNA independently of lacO sites) and 
analysed by western blotting for CDC45, MCM7 and PCNA. CDC45 and 
MCM7 dissociated from chromatin by 8 min in the vehicle control, but 
persisted following Ub-VS treatment. d, To test whether the MCM7 
modifications detected in panel c represented ubiquitylation, extracts were 
incubated with His6-ubiquitin in the absence of cyclin A, and in the absence or 
presence of plasmid DNA. After 15 min, His6-tagged proteins were captured 
by nickel resin pull down and blotted for MCM7. DNA replication greatly 
increased the levels of ubiquitylated MCM7, with the exception of a single 
species that was ubiquitylated independently of DNA replication (*). These 
data show that MCM7 is ubiquitylated during plasmid replication in egg 
extracts, as observed in yeast and during replication of sperm chromatin after 
nuclear assembly in egg extracts. e, In parallel to the plasmid pull downs 
performed in c, DNA samples were withdrawn for dissolution, ligation and 
decatenation assays, none of which was perturbed by Ub-VS treatment. These 
data support our conclusion, based on ChIP experiments (Fig. 5), that 
defective CMG unloading does not affect dissolution, ligation, or decatenation. 
f, Decatenation was measured in the same reactions used to measure 
dissolution and ligation (Fig. 5d, e), mean ± s.d. is plotted (n = 3). g-i, Given 
the experimental variability at the 4 min time point in Fig. 5d-f, the primary 
data and quantification for dissolution (g), ligation (h) and decatenation (i) for 
one of the three experiments summarized in Fig. 5d-f is presented. This 
reveals that Ub-VS does not inhibit dissolution, ligation, or decatenation at the 
4 min time point. The same conclusion applies to two additional repetitions 
of this experiment (data not shown). j, The primary ChIP data used to 
measure dissociation of MCM7 and CDC45 in Fig. 5b, c is shown. Recovery 
of FLK2 was measured. Mean ± s.d. is plotted (n = 3). 




[Extended Data Figure 9 image. Panels: a, gel of buffer- and cyclin A-treated reactions with catenanes (Cats) and circular monomers (CMs); b, plot for buffer and cyclin A against time (min); c, e, gels for p[empty], p[lacOx12] and p[lacOx16] with LacR in buffer (c) or cyclin A (e), showing DYs and Lins; d, f, plots for p[empty], p[lacOx12] and p[lacOx16] against time (min).]


Extended Data Figure 9 | Cyclin A treatment synchronizes DNA replication 
in Xenopus egg extracts. a, b, To synchronize DNA replication in Xenopus 
egg extracts, we treated extracts with cyclin A, which probably accelerates 
replication initiation. Plasmid DNA was incubated in High Speed 
Supernatant for 20 min, then either buffer or cyclin A was added for a further 
20 min. NucleoPlasmic extract was added to initiate DNA replication, along 
with [α-32P]dATP to label replication intermediates. Replication products were 
separated on a native agarose gel, detected by autoradiography (a), and 
quantified (b). In the presence of vehicle, replication was not complete by 
9.5 min, but in the presence of cyclin A, replication was almost complete by 
4.5 min (b). Thus, cyclin A treatment approximately doubles the speed of DNA 
replication in Xenopus egg extracts. c-f, To test whether cyclin A affects the 
ability of LacR to inhibit termination, we monitored dissolution of plasmids 
containing a 12 or 16 LacR array in the presence and absence of cyclin A. 
p[lacOx12], p[lacOx16], and the parental control plasmid p[empty] were incubated 
with LacR, and then treated with buffer or cyclin A before replication was 
initiated with NPE in the presence of [α-32P]dATP. Samples were withdrawn 
when dissolution of p[empty] plateaued (9.5 min in the presence of buffer, 
4.5 min in the presence of cyclin A). Given that cyclin A treatment 
approximately doubles the speed of replication (see b), samples were 
withdrawn from these reactions twice as frequently as the buffer-treated 
samples. To measure dissolution, radiolabelled termination intermediates were 
cut with XmnI to monitor the conversion of double-Y molecules to linear 
molecules. Cut molecules were separated on a native agarose gel and detected 
by autoradiography (c, e). By the time the first sample was withdrawn, 
dissolution of p[empty] was essentially complete, in the absence (9.5 min, d) or 
presence (4.5 min, f) of cyclin A. Importantly, dissolution of p[lacOx12] and 
p[lacOx16] was prevented in the absence (9.5 min, d) or presence (4.5 min, f) of 
cyclin A. Moreover, dissolution occurred approximately twice as fast in the 
presence of cyclin A (note the similarity between d and f even though samples 
are withdrawn twice as frequently in f) consistent with replication being 
approximately twice as fast in the presence of cyclin A. Therefore, cyclin A does 
not affect the ability of a LacR array to block replication forks. 




Extended Data Table 1 | Tables of plasmids and oligonucleotides used 


A. 


Plasmid 


pJD82 
pJD85 
pJD88 
pJD92 
pJD100 
pJD104 
pJD105 
pJD139 
pJD145 
pJD150 
pJD152 
pJD156 
pQNT 


B. 
Oligo 


JDO38 


JDO39 


JDO42 
JDO43 
JDO94 
JDO95 
JDO100 
JDO101 


JDO107 
JDO109 
JDO110 
JDO111 
FLK2_F 
FLK2_R 
LAC_F 
LAC_R 
FAR_F 
FAR_R 
QNT_F 


QNT_R 


Insert 

Sacl-BsrGI-(lacO)x4-BsiWI-Kpnl 
Sacl-BsrGI-(lacO)x8-BsiWI-Kpnl 
Sacl-BsrGI-(lacO)x16-BsiWI-Kpnl 
Sacl-BsrGI-(lacO)x32-BsiWI-Kpnl 
Sacl-BsrGI-(lacO)x48-BsiWI-Kpnl 
Sacl-BsrGI-(lacO)x12-BsiWI-Kpnl 
Sacl-Nb.BsmI-BsiWI-Nt.BbvCl-Kpnl 
Sacl-Nb.BsmlI-BsiWI-Nt.BbvCl-Kpnl 
Sacl-Nb.BsmI-BsiWI-Nt.BbvCl-Kpnl 
Sacl-Nb.Bsml-(lacO)x12-BsiWI-Nt.BbvCl-Kpnl 
Sacl-Nb.Bsml-(lacO)x16-BsiWI-Nt.BbvCl-Kpnl 
Sacl-Nb.Bsml-(lacO)x32-BsiWI-Nt.BbvCl-Kpnl 


Sequence 


5 '-GTACATCAATTGTGAGCGGATAACAATTGTTA 
GGGAGGAATTGTGAGCGGATAACAATTTGGAGTTG 
ATAATTGTGAGCGGATAACAATTGGCTTCAACGTA 
ATTGTGAGCGGATAACAATTTCC-3' 


5 '-GTACGGAAATTGTTATCCGCTCACAATTACGT 
TGAAGCCAATTGTTATCCGCTCACAATTATCAACT 
CCAAATTGTTATCCGCTCACAATTCCTCCCTAACA 
ATTGTTATCCGCTCACAATTGAT-3' 


5 '-CTGTACAGCATTCCCATGGCGTACGTTCTAGA 
CCTCAGCTATGGTACC-3' 


5 '-AGCGGTACCATAGCTGAGGTCTAGAACGTACG 
CCATGGGAATGCTGTACAGAGCT-3' 


5 '-TAAGGGATTTTGCCGATTTCGGCCTATGCTCT 
TCGCAGTGTGGTTAAAAAATGAGC-3' 


5 '-GCTCATTTTTTAACCACACTGCGAAGAGCATA 
GGCCGAAATCGGCAAAATCCCTTA-3' 


5 '-TGAGCGTCGATTCATTGCTTTGTGATGCTCGT 
CAGGGGG-3' 


5 '-CCCCCTGACGAGCATCACAAAGCAATGAATCG 
ACGCTCA-3' 


5 '-CAGTGTGGTTAAAAAATGAGCTG-3' 
5'-CATTGCTTTGTGATGCTCGT-3' 

5 '-TGGTTAAAAAATGAGCTGATTTAACA-3' 
5 '-TGAGGTCTAGAACGTACGGAAA-3' 
5 '-TCTTCGCTATTACGCCAGCT-3' 

5 '-TTACAACGTCGTGACTGGGA-3' 

5 '-AGCGGATAACAATTGTTAGGGA-3' 
5 '-CTCACAATTACGTTGAAGCCAA-3' 
5 '-ATTGCTACAGGCATCGTGGT-3' 

5 '-GGGATCATGTAACTCGCCTTGA-3' 
5 '-TACAAATGTACGGCCAGCAA-3' 


5 '-GAGTATGAGGGAAGCGGTGA-3' 


Construction 


Replacement of the sequence between Sacl and KpnI of pBluescript II KS- 
JDO38/39 annealed and cloned into pJD82 that had been cut with BsrGl 
BsrGI/BsiWI fragment from pJD85 cloned into pJD85 that had been cut with BsrGI 
BsrGI/BsiWI fragment from pJD88 cloned into pJD88 that had been cut with BsrGI 
BsrGI/BsiWI fragment from pJD88 cloned into pJD92 that had been cut with BsrGI 
JDO38/39 annealed and cloned into pJD85 that had been cut with BsrGI 
Replacement of the sequence between Sacl and KpnI of pBluescript II KS- with JDO42/43 
Quickchange mutagenesis of pJD105 using JDO94/95 

Quickchange mutagenesis of pJD139 using JDO100/10 

BsrGI/BsiWI fragment from pJD104 cloned into pJD145 that had been cut with BsiWI 
BsrGI/BsiWI fragment from pJD88 cloned into pJD145 that had been cut with BsiWI 
BsrGIl/BsiWI fragment from pJD92 cloned into pJD145 that had been cut with BsiWI 
pCDFDuet-1 containing a Hincll site (pQuant from 4") 


Description 


Can be annealed to JDO39 to generate dsDNA containing 4x lacO sites with ends that are 
compatible with BsiWI and BsrGI. 


Can be annealed to JDO38 to generate dsDNA containing 4x lacO sites with ends that are 
compatible with BsiWI and BsrGI. 


Can be annealed to JDO43 to generate dsDNA containing sites for 
BsrGI-Nb.BsmI-NcoI-BsiWI-XbaI-Nb.BbvCI with ends that are compatible with SacI and KpnI. 


Can be annealed to JDO42 to generate dsDNA containing sites for 
BsrGI-Nb.BsmI-NcoI-BsiWI-XbaI-Nb.BbvCI with ends that are compatible with SacI and KpnI. 


Used with JDO95 to introduce Nt.BspQI and Nb.BtsI sites upstream of Nb.BsmI in 
pJD105-derived plasmids by Quickchange mutagenesis. 


Used with JDO94 to introduce Nt.BspQI and Nb.BtsI sites upstream of Nb.BsmI in 
pJD105-derived plasmids by Quickchange mutagenesis. 


Used with JDO101 to introduce Nb.BsrDI site downstream of BbvCI in pJD105-derived 
plasmids by Quickchange mutagenesis. 


Used with JDO100 to introduce Nb.BsrDI site downstream of BbvCI in pJD105-derived 
plasmids by Quickchange mutagenesis. 


Sequencing primer for mapping leading strands released by Nt.BspQI digestion 

Sequencing primer for mapping leading strands released by Nb.BsrDI digestion 

Sequencing primer for mapping lagging strands released by Nb.BtsI digestion. 

Sequencing primer for mapping leading strands released by Nb.BbvCI digestion. 

Used with FLK2_R to amplify the region 82-173 bases upstream of the lacO array in pJD152 
Used with FLK2_F to amplify the region 82-173 bases upstream of the lacO array in pJD152 
Used with LAC_R to amplify four sites within the lacO array in pJD152 


Used with LAC_F to amplify four sites within the lacO array in pJD152 


Used with FAR_R to amplify the region 1286-1375 bases upstream of the lacO array in pJD152 
Used with FAR_F to amplify the region 1286-1375 bases upstream of the lacO array in pJD152 


Used with QNT_R to amplify pQNT 
Used with QNT_F to amplify pQNT 


©2015 Macmillan Publishers Limited. All rights reserved 


LETTER 


doi:10.1038/nature15262 


Relativistic boost as the cause of periodicity in a 
massive black-hole binary candidate 


Daniel J. D’Orazio¹, Zoltán Haiman¹ & David Schiminovich¹ 


Because most large galaxies contain a central black hole, and galaxies 
often merge¹, black-hole binaries are expected to be common 
in galactic nuclei². Although they cannot be imaged, periodicities 
in the light curves of quasars have been interpreted as evidence for 
binaries³⁻⁵, most recently in PG 1302-102, which has a short rest-frame 
optical period of four years (ref. 6). If the orbital period of 
the black-hole binary matches this value, then for the range of 
estimated black-hole masses, the components would be separated 
by 0.007-0.017 parsecs, implying relativistic orbital speeds. There 
has been much debate over whether black-hole orbits could be 
smaller than one parsec (ref. 7). Here we report that the amplitude 
and the sinusoid-like shape of the variability of the light curve of 
PG 1302-102 can be fitted by relativistic Doppler boosting of emis- 
sion from a compact, steadily accreting, unequal-mass binary. We 
predict that brightness variations in the ultraviolet light curve 
track those in the optical, but with a two to three times larger 
amplitude. This prediction is relatively insensitive to the details 
of the emission process, and is consistent with archival ultraviolet 
data. Follow-up ultraviolet and optical observations in the next few 
years can further test this prediction and confirm the existence of a 
binary black hole in the relativistic regime. 

Assuming PG 1302-102 is a binary, it is natural to attribute its 
optical emission to gas that is bound to each black hole, forming 
circumprimary and circumsecondary accretion flows. Such flows, 
which form ‘minidisks’, are generically found in high-resolution 
two- and three-dimensional hydrodynamic simulations that include 
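the black holes in their simulated domain⁸⁻¹⁵. Assuming a circular 
orbit, the velocity of the lower-mass secondary black hole is 

$$ v_2 = \frac{2\pi}{1+q}\left(\frac{GM}{4\pi^2 P}\right)^{1/3} = 8{,}500\ \mathrm{km\,s^{-1}} \left(\frac{1.5}{1+q}\right) \left(\frac{M}{10^{8.5}M_\odot}\right)^{1/3} \left(\frac{P}{4.04\ \mathrm{yr}}\right)^{-1/3} $$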


or approximately $0.03c$ for the fiducial parameters chosen in the 
parentheses on the right ($q = 0.5$, $M = 10^{8.5}M_\odot$, $P = 4.04$ yr), where 
$M = M_1 + M_2$ is the total binary mass, $M_{1,2}$ are the individual masses, 
$q = M_2/M_1 \le 1$ is the mass ratio, $M_\odot$ is the mass of the Sun, $P$ is the 
orbital period, $G$ is the gravitational constant, and $c$ is the speed of light. 
The orbital velocity of the higher-mass primary black hole is $v_1 = qv_2$. 
Even if a minidisk has a steady intrinsic rest-frame luminosity, its 
apparent flux on Earth is modulated by relativistic Doppler beaming. 
The photon frequencies suffer relativistic Doppler shift by the factor 
$D = [\Gamma(1 - \beta_\parallel)]^{-1}$, where $\Gamma = (1 - \beta^2)^{-1/2}$ is the Lorentz factor, 
$\beta = v/c$ is the three-dimensional velocity $v$ in units of the speed of light 
$c$, and $\beta_\parallel = \beta\cos(\phi)\sin(i)$ is the component of the velocity along the 
line of sight, with $i$ and $\phi$ the orbital inclination and phase, respectively. 
Because the photon phase-space density, which is proportional to 
$F_\nu/\nu^3$, is invariant in special relativity, the apparent flux $F_\nu$ at a fixed 
observed frequency $\nu$ is modified from the flux of a stationary source 
$F^0_\nu$ to $F_\nu = D^3F^0_{D^{-1}\nu} = D^{3-\alpha}F^0_\nu$. The last step assumes an intrinsic 
power-law spectrum $F^0_\nu \propto \nu^\alpha$. To first order in $v/c$, this assumption 
causes a sinusoidal modulation of the apparent flux along the orbit, 
by a fractional amplitude $\Delta F_\nu/F_\nu = \pm(3 - \alpha)(v/c)\cos(\phi)\sin(i)$. Although 
light-travel time modulations appear at the same order, they are 
subdominant to the Doppler modulation. This modulation is analogous to 
periodic modulations from relativistic Doppler boost predicted¹⁶ and 
observed for extrasolar planets¹⁷,¹⁸ and for a double white-dwarf 
binary¹⁹, but here it has a much higher amplitude. 

The light curve of PG 1302-102 is well measured over approximately 
two periods (approximately 10 yr). The amplitude of the variability is 
±0.14 mag (measured in the optical V band²⁰), which corresponds to 
$\Delta F_\nu/F_\nu = \pm 0.14$. The spectrum of PG 1302-102 in and around the 
V band is well approximated by a double power-law, with $\alpha \approx 0.7$ 
(between 0.50 μm and 0.55 μm) and $\alpha \approx 1.4$ (between 0.55 μm and 
0.6 μm), except for small deviations caused by broad lines. We obtain 
an effective single slope $\alpha_{\rm opt} = 1.1$ over the entire V band. We conclude 
that the 14% variability can be attributed to relativistic beaming for a 
line-of-sight velocity amplitude of $v\sin(i) = 0.074c = 22{,}000$ km s⁻¹. 
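
The numbers quoted so far can be checked with a few lines of arithmetic. The following is a minimal sketch, not the authors' code: it assumes a circular Keplerian orbit, rounded CGS constants, and hypothetical function names chosen for illustration. It evaluates the secondary's orbital speed for the fiducial parameters, the binary separation from Kepler's third law, and the line-of-sight speed needed for a 14% modulation with the optical slope of 1.1.

```python
import math

# Physical constants in CGS units (standard values, rounded).
G = 6.674e-8        # gravitational constant, cm^3 g^-1 s^-2
M_SUN = 1.989e33    # solar mass, g
YEAR = 3.156e7      # seconds per year
PARSEC = 3.086e18   # centimetres per parsec


def secondary_speed_kms(total_mass_msun, q, period_yr):
    """Circular-orbit speed of the secondary: (2*pi*G*M/P)^(1/3) / (1 + q)."""
    gm = G * total_mass_msun * M_SUN
    period_s = period_yr * YEAR
    relative_speed = (2.0 * math.pi * gm / period_s) ** (1.0 / 3.0)  # cm/s
    return relative_speed / (1.0 + q) / 1.0e5  # convert cm/s to km/s


def separation_pc(total_mass_msun, period_yr):
    """Kepler's third law: a = (G*M*P^2 / (4*pi^2))^(1/3), in parsecs."""
    gm = G * total_mass_msun * M_SUN
    period_s = period_yr * YEAR
    return (gm * period_s ** 2 / (4.0 * math.pi ** 2)) ** (1.0 / 3.0) / PARSEC


def required_beta_los(fractional_amplitude, alpha):
    """Invert dF/F = (3 - alpha) * (v/c) * sin(i) for the line-of-sight v/c."""
    return fractional_amplitude / (3.0 - alpha)


v2 = secondary_speed_kms(10 ** 8.5, 0.5, 4.04)
beta_los = required_beta_los(0.14, 1.1)
print(f"secondary speed: {v2:.0f} km/s")
print(f"required v sin(i) / c: {beta_los:.3f}")
print(f"separation for 10^9.4 solar masses: {separation_pc(10 ** 9.4, 4.04):.4f} pc")
```

With these constants the fiducial speed comes out close to the quoted 8,500 km s⁻¹, and the required line-of-sight speed is close to 0.074c.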

Although large, this velocity can be realized for a massive (high-M) 
but unequal-mass (low-q) binary, whose orbit is viewed not too far 
from edge-on (high sin(i)). In Fig. 1, we show the required combina- 
tion of these three parameters that would produce a 0.14-mag vari- 
ability in the sum-total of Doppler-shifted emission from the primary 
and the secondary black hole. As this figure shows, the required mass is 
$\gtrsim 10^{9.1}M_\odot$, consistent with the high end of the range that has been 
inferred for PG 1302-102. The orbital inclination can be in the 
range $i = 60°$-$90°$. The mass ratio $q$ has to be low, $q < 0.3$, which is 
consistent with expectations based on cosmological galaxy merger 
models²¹, and also with the identification of the optical and binary 
periods (for $q \gtrsim 0.3$, hydrodynamic simulations predict that the 
mass-accretion rates fluctuate with a period several times longer than 
the orbital period²²). 

As Fig. 1 shows, fully accounting for the observed optical variability 
also requires that the bulk of the optical emission arises from gas 
bound to the faster-moving secondary black hole ($f_2 \gtrsim 80\%$). We find 
that this condition is naturally satisfied for unequal-mass black holes. 
Hydrodynamic simulations have shown that for 0.03 < q < 0.1, the 
accretion rate onto the secondary black hole is a factor of 10-20 higher 
than that onto the primary¹³. Because the secondary captures most of 
the accreting gas from the circumbinary disk, the primary is ‘starved’, 
and radiates with a much lower efficiency. In the (M,q) ranges 
favoured by the beaming scenario, we find that the primary contributes 
less than 1% to the total luminosity, and the circumbinary disk con- 
tributes less than 20%, leaving the secondary as the dominant source of 
emission in the three-component system (see Methods). 

The optical light curve of PG 1302-102 appears remarkably sinus- 
oidal compared to that of the best-studied previous quasi-periodic 
quasar binary black-hole candidate, which shows periodic bursts⁴. 
Nevertheless, the light-curve shape deviates from a pure sinusoid. 
To see if such deviations naturally arise within our model, we maximized 
the Bayesian likelihood over five parameters (period $P$, velocity 
amplitude $K$, eccentricity $e$, argument of pericentre $\omega$, and an arbitrary 
reference time $t_0$) of a Kepler orbit²³ and fitted the observed optical 
light curve. In this procedure, we accounted for additional stochas- 
tic physical variability with a broken-power-law power spectrum 


¹Department of Astronomy, Columbia University, 550 West 120th Street, New York, New York 10027, USA. 


17 SEPTEMBER 2015 | VOL 525 | NATURE | 351 




[Figure 1 plot: three panels for $f_2 = 1.0$, 0.95 and 0.8; horizontal axis $M/M_\odot$ from $10^{9.0}$ to $10^{9.4}$; vertical axis inclination $i$ from 50° to 90°.] 


Figure 1 | Binary parameters producing the optical flux variations of 
PG 1302-102 by relativistic boost. Combinations of total binary mass $M$, 
mass ratio $q = M_2/M_1$, and inclination $i$ that cause ≥13.5% flux variability (or 
line-of-sight velocity amplitude $(v/c)\sin(i) \ge 0.07$) in the emission from the 
primary and secondary black holes, computed from the Doppler factor $D^{3-\alpha}$ 
with the effective spectral slope of $\alpha_{\rm opt} = 1.1$ in the V band. The solid lines 


(a ‘damped random walk’²⁴) described by two additional parameters. 
This analysis returns a best-fit with a non-zero eccentricity of 
e=0.09*9;52, although a Bayesian criterion does not favour this 
model over a pure sinusoid with fewer parameters (see Methods). 
We considered an alternative model to explain optical variability of 
PG 1302-102, in which the luminosity variations track the fluctuations 
in the mass-accretion rate that is predicted in hydrodynamic simula- 
tions”"''!°?°, However, the amplitude of these hydrodynamic fluctua- 
tions are large (order one), and their shape is ‘bursty’ rather than 
sinusoid-like'’’*; as a result, we find that they provide a poorer fit 
to the observations (see Fig. 2 and Methods). Furthermore, for mass 
ratios $q \gtrsim 0.05$, hydrodynamic simulations predict a characteristic 
pattern of periodicities at multiple frequencies, but an analysis of 
the periodogram of PG 1302-102 has not uncovered evidence for multiple 
peaks²⁶. 


[Figure 2 plot: vertical axis, Magnitude (mag), 14.4 to 15.2; horizontal axis, MJD − 49,100, 2,000 to 7,000.] 


Figure 2 | The optical and ultraviolet light curves of PG 1302-102. The grey 
filled circles with 1σ errors are the optical data⁶, superimposed with a best-fit 
sinusoid (red dashed curve). The solid black curve is the best-fit relativistic 
light curve. The blue dashed curve is the best-fit model that was obtained by 
scaling the mass-accretion rate determined from a hydrodynamic simulation of 
an unequal-mass (q = 0.1) binary¹¹. The red and blue filled circles with 1σ 
errors correspond to archival NUV (red) and FUV (blue) spectral observations; 
the red filled triangles (with 1σ errors) represent archival photometric NUV 
data (see Fig. 3). The UV data include an arbitrary overall normalization to 
match the mean optical brightness. The red and blue dotted curves are the 
best-fit relativistic optical light curves with amplitudes scaled up by factors of 2.17 
and 2.57, which best match the NUV and FUV data, respectively. MJD, 
modified Julian day. 




correspond to different values of q as labelled; the shaded regions correspond to 
intermediate values. We assume that a fraction $f_2 = 1.0$, 0.95, or 0.8 of the total 
luminosity arises from the secondary black hole; these values are consistent 
with fractions found in hydrodynamic simulations¹³ (see Methods). The 
inclination angle is defined such that i = 0° corresponds to a face-on view of 
PG 1302-102, and i = 90° corresponds to an edge-on view. 


A simple observational test of relativistic beaming is possible, owing to 
the strong frequency dependence of the spectral slope of PG 1302-102: 
$\alpha = d\ln(F_\nu)/d\ln(\nu)$. The continuum spectrum of PG 1302-102 is nearly 
flat with a slope $\beta_{\rm FUV} = d\ln(F_\lambda)/d\ln(\lambda) = 0$ in the far-ultraviolet (FUV; 
0.145-0.1525 μm) band, where $F_\lambda$ is the apparent flux at an observed 
wavelength $\lambda$, and shows a tilt with $\beta_{\rm NUV} = -0.95$ in the near-ultraviolet 
(NUV; 0.20-0.26 μm) range; see Fig. 3 and Methods. These slopes 
translate to $\alpha_{\rm FUV} = -2$ and $\alpha_{\rm NUV} = -1.05$ in the respective bands, compared 
to $\alpha_{\rm opt} = 1.1$ in the optical. The UV emission can be attributed to the 
same minidisks that are responsible for the optical light, and would 
therefore share the same Doppler shifts in frequency. These Doppler 
shifts would translate into UV variability that is larger by a factor of 
$(3 - \alpha_{\rm FUV})/(3 - \alpha_{\rm opt}) = 5/1.9 = 2.63$ and $(3 - \alpha_{\rm NUV})/(3 - \alpha_{\rm opt}) = 4.05/1.9 = 2.13$ 
compared to the optical, and reaches maximum amplitudes of 
±37% (FUV) and ±30% (NUV). 
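
The slope conversion and the predicted amplitude ratios follow from short arithmetic. The sketch below is illustrative only; the function names are hypothetical. It uses the standard relation between flux per unit wavelength and flux per unit frequency (F_ν is proportional to λ²F_λ), so that α = −(β + 2), and then the beaming scaling (3 − α).

```python
def alpha_from_beta(beta_lambda):
    """Convert a slope in F_lambda vs wavelength to a slope in F_nu vs frequency.

    F_nu is proportional to lambda^2 * F_lambda, so alpha = -(beta + 2).
    """
    return -(beta_lambda + 2.0)


def amplitude_ratio(alpha_band, alpha_reference):
    """Ratio of Doppler-boost amplitudes between two bands: (3 - a1) / (3 - a2)."""
    return (3.0 - alpha_band) / (3.0 - alpha_reference)


ALPHA_OPT = 1.1           # effective optical slope quoted in the text
OPTICAL_AMPLITUDE = 0.14  # fractional optical amplitude quoted in the text

alpha_fuv = alpha_from_beta(0.0)
alpha_nuv = alpha_from_beta(-0.95)
ratio_fuv = amplitude_ratio(alpha_fuv, ALPHA_OPT)
ratio_nuv = amplitude_ratio(alpha_nuv, ALPHA_OPT)

print(f"alpha_FUV = {alpha_fuv:.2f}, ratio = {ratio_fuv:.2f}, "
      f"amplitude = {OPTICAL_AMPLITUDE * ratio_fuv:.2f}")
print(f"alpha_NUV = {alpha_nuv:.2f}, ratio = {ratio_nuv:.2f}, "
      f"amplitude = {OPTICAL_AMPLITUDE * ratio_nuv:.2f}")
```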


[Figure 3 plot: vertical axis, flux (scale up to 6 × 10⁻¹⁴; units not recoverable from the extraction); horizontal axis, Wavelength (Å), 1,400 to 2,600. Legend: FOS NUV (−280), STIS FUV (3,042), GALEX (5,433), GALEX (5,827), COS FUV (6,689).] 


Figure 3 | Archival UV spectra of PG 1302-102 from 1992-2011. FUV and 
NUV spectra obtained by instruments on the HST and by GALEX, as labelled. 
COS, cosmic origins spectrograph; FOS, faint object spectrograph; STIS, 
space telescope imaging spectrograph. Numbers in brackets are the dates (in 
MJD − 49,100) the data were collected. Vertical yellow bands mark regions 
outside the spectroscopic range of both GALEX and the HST and contain no 
useful spectral data. Assignments of the main peaks are given. Lyα, Lyman α. 
From each spectrum, average flux measurements (shown in Fig. 2) were 
computed in one or both of the UV bands over the frequency range indicated by 
the horizontal bars. The full GALEX photometric band shapes for FUV and 
NUV photometry are shown for reference as shaded blue and red curves, 
respectively. Additional GALEX NUV photometric data were also used 
in Fig. 2. The UV spectra show an offset by as much as ±30% relative to one 
another, close to the value expected from relativistic boost (see Methods). 




Five separate UV spectra of PG 1302-102 have been collected 
between 1992 and 2011, by instruments on the Hubble Space 
Telescope (HST) and on the Galaxy Evolution Explorer (GALEX) 
satellite (see Fig. 3); additional photometric observations were taken 
with GALEX at four different times between 2006 and 2009 (shown in 
Fig. 2). The brightness in both the FUV and NUV bands 
shows variability that resembles the optical variability, but with a larger 
amplitude. Adopting the parameters of our best-fit sinusoid model, 
and allowing only the amplitude to vary, we find that the UV data 
yield best-fit variability amplitudes of $\Delta F_\nu/F_\nu|_{\rm FUV} = \pm(35.0 \pm 3.9)\%$ 
and $\Delta F_\nu/F_\nu|_{\rm NUV} = \pm(29.5 \pm 2.4)\%$ (shown in Fig. 2). These 
amplitudes are factors of 2.57 ± 0.28 and 2.17 ± 0.17 higher than in the 
optical, in excellent agreement with the values 2.63 and 2.13 that are 
expected from the corresponding spectral slopes. 
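
One simple way to quantify this agreement is to express each measured ratio's offset from its prediction in units of the quoted uncertainty. This is an illustrative check, not a procedure described in the paper, and the helper name is hypothetical.

```python
def offset_in_sigma(measured, uncertainty, predicted):
    """Absolute difference between measurement and prediction, in units of sigma."""
    return abs(measured - predicted) / uncertainty


# Measured UV-to-optical amplitude ratios and predictions quoted in the text.
fuv_offset = offset_in_sigma(2.57, 0.28, 2.63)
nuv_offset = offset_in_sigma(2.17, 0.17, 2.13)

print(f"FUV offset: {fuv_offset:.2f} sigma")
print(f"NUV offset: {nuv_offset:.2f} sigma")
```

Both offsets are well under one standard deviation, which matches the statement that the measured and predicted ratios agree.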

Relativistic beaming provides a simple and robust explanation of the 
optical periodicity of PG 1302-102. The prediction that the larger UV 
variations should track the optical light curve can be tested rigorously 
in the future with measurements of the optical and UV brightness that 
are collected at or near the same time, are repeated two or more times, 
are separated by a few months to about 2 yr, and cover up to half of the 
optical period. A positive result will constitute the first detection of 
relativistic massive black-hole binary motion; it will also serve as a 
confirmation of the binary nature of PG 1302-102, remove the ambi- 
guity in the orbital period, and tightly constrain the binary parameters 
to be close to those shown in Fig. 1. 


Online Content Methods, along with any additional Extended Data display items 
and Source Data, are available in the online version of the paper; references unique 
to these sections appear only in the online paper. 


Received 10 February; accepted 27 July 2015. 


1. Kormendy, J. & Ho, L. C. Coevolution (or not) of supermassive black holes and host 
galaxies. Annu. Rev. Astron. Astrophys. 51, 511-653 (2013). 

2. Begelman, M.C., Blandford, R. D. & Rees, M. J. Massive black hole binaries in active 
galactic nuclei. Nature 287, 307-309 (1980). 

3. Komossa, S. Observational evidence for binary black holes and active double 
nuclei. Mem. Soc. Astron. Ital. 77, 733-741 (2006). 

4. Valtonen, M. J. et al. A massive binary black-hole system in OJ 287 and a test of 
general relativity. Nature 452, 851-853 (2008). 

5. Liu, T. et al. A periodically varying luminous quasar at z = 2 from the Pan-STARRS1 
Medium Deep Survey: a candidate supermassive black hole binary in the 
gravitational wave-driven regime. Astrophys. J. 803, L16 (2015). 

6. Graham, M. J. et al. A possible close supermassive black-hole binary in a quasar 
with optical periodicity. Nature 518, 74-76 (2015). 

7. Milosavljevic, M. & Merritt, D. in The Astrophysics of Gravitational Wave Sources, AIP 
Conf. Proc. (eds Centrella, J. M. & Barnes, S.) 686, 201-210 (AIP, 2003). 

8. Hayasaki, K., Mineshige, S. & Ho, L. C. A supermassive binary black hole with triple 
disks. Astrophys. J. 682, 1134-1140 (2008). 

9. Shi, J.-M., Krolik, J. H., Lubow, S. H. & Hawley, J. F. Three-dimensional 
magnetohydrodynamic simulations of circumbinary accretion disks: disk 
structures and angular momentum transport. Astrophys. J. 749, 118 (2012). 




10. Roedig, C. et al. Evolution of binary black holes in self gravitating discs. Dissecting 
the torques. Astron. Astrophys. 545, A127 (2012). 

11. D'Orazio, D. J., Haiman, Z. & MacFadyen, A. Accretion into the central cavity of a 
circumbinary disc. Mon. Not. R. Astron. Soc. 436, 2997-3020 (2013). 

12. Nixon, C., King, A. & Price, D. Tearing up the disc: misaligned accretion on to a 
binary. Mon. Not. R. Astron. Soc. 434, 1946-1954 (2013). 

13. Farris, B. D., Duffell, P., MacFadyen, A. I. & Haiman, Z. Binary black hole accretion 
from a circumbinary disk: gas dynamics inside the central cavity. Astrophys. J. 783, 
134 (2014). 

14. Dunhill, A. C., Cuadra, J. & Dougados, C. Precession and accretion in 
circumbinary discs: the case of HD 104237. Mon. Not. R. Astron. Soc. 448, 
3545-3554 (2015). 

15. Shi, J.-M. & Krolik, J. H. Three-dimensional MHD simulation of circumbinary 
accretion disks. II. Net accretion rate. Astrophys. J. 807, 131 (2015). 

16. Loeb, A. & Gaudi, B. S. Periodic flux variability of stars due to the reflex 
Doppler effect induced by planetary companions. Astrophys. J. 588, L117-L120 
(2003). 

17. van Kerkwijk, M. H. et al. Observations of Doppler boosting in Kepler light curves. 
Astrophys. J. 715, 51-58 (2010). 

18. Mazeh, T. & Faigler, S. Detection of the ellipsoidal and the relativistic beaming 
effects in the CoRoT-3 lightcurve. Astron. Astrophys. 521, L59 (2010). 

19. Shporer, A. et al. A ground-based measurement of the relativistic beaming 
effect in a detached double white dwarf binary. Astrophys. J. 725, L200-L204 
(2010). 

20. Djorgovski, S. G. et al. Exploring the variable sky with the Catalina Real-time 
Transient Survey. In The First Year of MAXI: Monitoring Variable X-ray Sources 
(eds Mihara, T. & Serino, M.) 32 (MAXI, 2010). 

21. Volonteri, M., Haardt, F. & Madau, P. The assembly and merging history of 
supermassive black holes in hierarchical models of galaxy formation. Astrophys. J. 
582, 559-573 (2003). 

22. D'Orazio, D. J., Haiman, Z., Duffell, P., Farris, B. D. & MacFadyen, A. I. A reduced 
orbital period for the supermassive black hole binary candidate in the quasar PG 
1302-102? Mon. Not. R. Astron. Soc. 452, 2540-2545 (2015). 

23. Wright, J. T. & Gaudi, B. S. in Planets, Stars and Stellar Systems Vol. 3 (eds Oswalt, 
T. D. et al.) 489-540 (Springer, 2013). 

24. Kelly, B.C., Bechtold, J. & Siemiginowska, A. Are the variations in quasar optical flux 
driven by thermal fluctuations? Astrophys. J. 698, 895-910 (2009). 

25. MacFadyen, A. I. & Milosavljevic, M. An eccentric circumbinary accretion 
disk and the detection of binary massive black holes. Astrophys. J. 672, 83-93 
(2008). 

26. Charisi, M., Bartos, I., Haiman, Z., Price-Whelan, A. & Marka, S. Multiple periods in 
the variability of the supermassive black hole binary candidate quasar PG1302- 
102? Mon. Not. R. Astron. Soc. Lett. (in the press). 


Acknowledgements The authors thank M. Graham, J. Halpern, A. Price-Whelan, 
J. Andrews, M. Charisi, E. Quataert, and B. Kocsis for discussions. We also thank 
M. Graham for providing the optical data in electronic form. This work was supported by 
the National Science Foundation Graduate Research Fellowship under grant no. 
DGE1144155 (D.J.D.) and by the NASA grant NNX11AE05G (Z.H.). 


Author Contributions Z.H. conceived and supervised the project, performed the orbital 
velocity calculations, and wrote the first draft of the paper. D.J.D. computed the 
emission models and performed the fits to the observed light curve. D.S. analysed the 
archival UV data. All authors contributed to the text. 


Author Information Reprints and permissions information is available at 
www.nature.com/reprints. The authors declare no competing financial interests. 
Readers are welcome to comment on the online version of the paper. Correspondence 
and requests for materials should be addressed to Z.H. (zoltan@astro.columbia.edu). 




METHODS 

V-band emission from a three-component system in PG 1302-102. Here we 
assume that the PG 1302-102 supermassive black-hole (SMBH) binary system 
includes three distinct luminous components: a circumbinary disk (CBD), as well 
as actively accreting primary and secondary SMBHs. The optical brightness of each 
of the three components can be estimated once their accretion rates and the 
black-hole masses $M_1$ and $M_2$ are specified. Using the absolute V-band magnitude of 
PG 1302-102, $M_V = -25.81$, and applying a bolometric correction BC ≈ 10 (ref. 
27), we infer a total bolometric luminosity of $L_{\rm bol} = 6.5({\rm BC}/10) \times 10^{46}$ erg s⁻¹. 
Bright quasars with the most massive SMBHs ($M \gtrsim 10^9M_\odot$) have a typical 
radiative efficiency of $\epsilon = 0.3$ (ref. 28). Adopting this value, the implied accretion rate is 
$\dot{M} = L_{\rm bol}/(\epsilon c^2) = 3.7\,M_\odot\,{\rm yr}^{-1}$ (where the overdot indicates differentiation with 
respect to time). 
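
The accretion rate follows directly from the bolometric luminosity and the radiative efficiency. The sketch below is a rough check under stated assumptions: rounded CGS constants and a hypothetical function name. With these constants the result lands near 3.8 solar masses per year, close to the quoted 3.7; the small difference presumably reflects rounding of the inputs.

```python
C_LIGHT = 2.998e10   # speed of light, cm/s
M_SUN = 1.989e33     # solar mass, g
YEAR = 3.156e7       # seconds per year


def accretion_rate_msun_per_yr(l_bol_erg_s, efficiency):
    """Mass accretion rate Mdot = L_bol / (efficiency * c^2), in solar masses per year."""
    mdot_g_per_s = l_bol_erg_s / (efficiency * C_LIGHT ** 2)
    return mdot_g_per_s * YEAR / M_SUN


mdot = accretion_rate_msun_per_yr(6.5e46, 0.3)
print(f"implied accretion rate: {mdot:.1f} solar masses per year")
```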

We identify this as the total accretion rate through the CBD, and require that at 
small radii, the rate is split between the two black holes, $\dot{M} = \dot{M}_1 + \dot{M}_2$, with the 
ratio $\eta \equiv \dot{M}_2/\dot{M}_1$. Hydrodynamic simulations¹³ have found that the secondary 
black hole captures the large majority of the gas, with $10 < \eta < 20$ for 
$0.03 < q \lesssim 0.1$ (where $q \equiv M_2/M_1$). Defining the Eddington ratio of the ith disk 
as its accretion rate scaled by its Eddington-limited rate, $f_{i,{\rm Edd}} \equiv \dot{M}_i/\dot{M}^{(i)}_{\rm Edd}$ with 
$\dot{M}^{(i)}_{\rm Edd} \equiv L^{(i)}_{\rm Edd}/(0.1c^2)$ (here $L^{(i)}_{\rm Edd}$ is the Eddington luminosity for the ith black hole, 
and we have adopted the fiducial radiative efficiency of $\epsilon = 0.1$ to be consistent 
with the standard definition in the literature), we have 

$$ f_{\rm CBD,Edd} = 0.068\left(\frac{M}{10^{9.4}M_\odot}\right)^{-1} $$

$$ f_{1,{\rm Edd}} = f_{\rm CBD,Edd}\,\frac{1+q}{1+\eta} = 0.0034\left(\frac{f_{\rm CBD,Edd}}{0.068}\right)\left(\frac{1+q}{1.05}\right)\left(\frac{21}{1+\eta}\right) $$

$$ f_{2,{\rm Edd}} = \frac{\eta}{q}\,f_{1,{\rm Edd}} \approx 1.37\left(\frac{f_{1,{\rm Edd}}}{0.0034}\right)\left(\frac{\eta}{20}\right)\left(\frac{0.05}{q}\right) $$


where the subscripts 1 and 2 refer to the primary and the secondary black holes, 
respectively. We adopt a standard, radiatively efficient, geometrically thin, optic- 
ally thick Shakura-Sunyaev disk model” to compute the luminosities produced 
in the CBD and the circumsecondary disk (CSD). Although the secondary 
black hole is accreting at a super-Eddington rate, recent three-dimensional radi- 
ation magnetohydrodynamic simulations of super-Eddington accretion find 
radiative efficiencies comparable to the values in standard thin-disk models*”. 
On the other hand, the primary black hole is accreting below the critical rate 
$\dot{M}_1 \lesssim \dot{M}_{\rm ADAF} \approx 0.027(\alpha/0.3)^2\dot{M}_{\rm Edd}$ (ADAF, advection-dominated accretion flow; 
a is the viscosity parameter) at which advection dominates the energy balance*’. 
We therefore estimate the luminosity of the primary black hole from a radiatively 
inefficient ADAF*”, rather than from a Shakura-Sunyaev disk. This interpretation 
is supported by the fact that PG 1302-102 is known to be an extended radio 
source, with evidence for a jet and bends in the extended radio structure™, features 
that are commonly associated with sub-Eddington sources”. 
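
The way the circumbinary Eddington ratio is divided between the two black holes can be written out explicitly. This is a minimal sketch assuming only the definitions above: the total accretion rate is split as η = Ṁ₂/Ṁ₁, the masses as q = M₂/M₁, and each Eddington rate scales linearly with mass. The function name is hypothetical.

```python
def split_eddington_ratios(f_cbd, q, eta):
    """Return (f1, f2), the Eddington ratios of the primary and secondary.

    Mdot_1 = Mdot / (1 + eta) and M_1 = M / (1 + q), so
    f1 = f_cbd * (1 + q) / (1 + eta); likewise f2 = eta * f1 / q.
    """
    f1 = f_cbd * (1.0 + q) / (1.0 + eta)
    f2 = eta * f1 / q
    return f1, f2


# Fiducial values used in the text: f_CBD,Edd = 0.068, q = 0.05, eta = 20.
f1, f2 = split_eddington_ratios(0.068, 0.05, 20.0)
print(f"primary Eddington ratio:   {f1:.4f}")
print(f"secondary Eddington ratio: {f2:.2f}")
```

For the fiducial values the primary sits far below the critical ADAF rate of about 0.027, while the secondary is mildly super-Eddington, which is the asymmetry the text relies on.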

For the radiatively efficient CBD and CSD, the frequency-dependent luminosity 
is determined by integrating the local, modified black-body flux over the area of 
the disk: 

$$ L_\nu = 2\pi\int_{R_{\rm in}}^{R_{\rm out}} F_\nu[T_{\rm p}(r)]\,r\,dr $$

where 

$$ F_\nu = \pi\,\frac{2\epsilon_\nu^{1/2}}{1+\epsilon_\nu^{1/2}}\,B_\nu, \qquad \epsilon_\nu \equiv \frac{\kappa_\nu^{\rm abs}}{\kappa_\nu^{\rm abs}+\kappa^{\rm es}} $$

$B_\nu$ is the Planck function, $\kappa_\nu^{\rm abs}$ is the frequency-dependent absorption opacity, $\kappa^{\rm es}$ is 
the electron scattering opacity, and $r \in (R_{\rm in}, R_{\rm out})$ is the radial coordinate, with 
$R_{\rm in,out}$ the inner and outer radii of the appropriate disk. We compute the radial 
disk-photosphere temperature profile $T_{\rm p}$ by equating the viscous heating rate with 
the modified black-body flux: 

$$ \frac{3GM\dot{M}}{8\pi r^3}\left[1-\left(\frac{r_{\rm ISCO}}{r}\right)^{1/2}\right] = \zeta(\nu,T_{\rm p})\,\sigma T_{\rm p}^4 $$

where 

$$ \zeta(\nu,T_{\rm p}) = \frac{15}{\pi^4}\int \frac{2\epsilon^{1/2}(x)}{1+\epsilon^{1/2}(x)}\,\frac{x^3e^{-x}}{1-e^{-x}}\,dx, \qquad x \equiv \frac{h\nu}{k_{\rm B}T_{\rm p}} $$

$\sigma$ is the Stefan-Boltzmann constant, $k_{\rm B}$ is the Boltzmann constant, $h$ is the 
Planck constant, and $r_{\rm ISCO}$ is the radius of the innermost stable circular orbit 
(ISCO). When solving for the photosphere temperature, we work in the limit that 
$\kappa^{\rm abs} \ll \kappa^{\rm es}$ following appendix A of ref. 36, and we adopt $r_{\rm ISCO} = 6GM_i/c^2$, and 
$(R_{\rm in}, R_{\rm out}) = (2a, 200a)$ and $(R_{\rm in}, R_{\rm out}) = (r^{i}_{\rm ISCO}, a(q/3)^{1/3})$ for the CBD and CSD, 
respectively. Here the superscript i refers to the ith disk, $a$ is the binary separation, 
$6GM/c^2$ is the location of the ISCO for a Schwarzschild black hole (our results are 
insensitive to this choice) and $a(q/3)^{1/3}$ is the Hill radius of the secondary black 
hole (which provides an upper limit on the size of the CSD”). 

The optical luminosity of an ADAF is sensitive to the assumed microphysical 
parameters and its computation is more complicated than that for a thin disk. Here 
we first compute a reference thin-disk luminosity $L_{\rm SS}$ (SS, Shakura-Sunyaev) for 
the primary black hole, and multiply it by the ratio of the bolometric luminosity of 
an ADAF to an equivalent thin-disk luminosity from ref. 32: 

$$ \frac{L_{\rm ADAF}}{L_{\rm SS}} = 0.008\left(\frac{\dot{M}/\dot{M}_{\rm Edd}}{0.0034}\right)\left(\frac{\alpha}{0.3}\right)^{-2} $$

For calculating the reference $L_{\rm SS}$, we adopted parameters that are consistent with 
ref. 32, in particular, $\epsilon = 0.1$. Although the above ratio is for bolometric 
luminosities, we find that it agrees well with the factor-of-100 difference in the V band 
shown in figure 6 of ref. 33, between ADAF and thin-disk spectra, with parameters 
similar to PG 1302-102 ($10^9M_\odot$, $\dot{M} = \dot{M}_{\rm ADAF} = 10^{-1.5}\dot{M}_{\rm Edd}$, $\alpha \approx 0.3$). 

Extended Data Figure 1 shows the thin-disk CBD and CSD spectra for a total 
Eddington ratio of $f_{\rm CBD,Edd} = 0.07$, consistent with the high-mass estimates for 
PG 1302-102 that are needed for the beaming scenario ($M = 10^{9.4}M_\odot$ and 
$q = 0.05$). The red dot shows the reduced V-band luminosity of an ADAF onto 
the primary. The secondary clearly dominates the total V-band luminosity, with 
the primary contributing less than 1%, and the CBD contributing 
approximately 14%. In practice, the contribution from the CBD becomes non-negligible 
only for the smallest binary masses and lowest mass ratios (reaching 20% for 
$M < 10^9M_\odot$ and $q < 0.025$). 

We compute the contributions of each of the three components to the total 
luminosity, $L^V_{\rm tot} = L^V_1 + L^V_2 + L^V_{\rm CBD}$, and the corresponding total 
fractional-modulation amplitude $\Delta L^V_{\rm tot}/L^V_{\rm tot} = (\Delta L^V_1 + \Delta L^V_2)/L^V_{\rm tot}$, for each value of the 
total mass $M$ and mass ratio $q$. The primary is assumed to be Doppler modulated 
with a line-of-sight velocity $v_1 = -qv_2$, whereas the emission from the CBD is 
assumed to be constant over time ($\Delta L^V_{\rm CBD} = 0$). Extended Data Figure 2 shows 
regions in $(M, q, i)$ parameter space where the total luminosity variation due to 
relativistic beaming exceeds 14%. This figure recreates Fig. 1 of the main text, but 
using the luminosity contributions computed self-consistently in the above model, 
rather than assuming a constant value of $f_2$. Because the secondary is found to be 
dominant, the relativistic-beaming scenario is consistent with a wide range of 
binary parameters. 
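
The dilution of the secondary's modulation by the other two components can be sketched as follows. This is an illustration under the assumptions stated above (primary modulated in antiphase with amplitude scaled by q, constant circumbinary disk), with a hypothetical function name and illustrative luminosity fractions rather than the paper's computed spectra.

```python
def total_modulation(amp_secondary, q, f_primary, f_secondary):
    """Fractional modulation of the summed light.

    amp_secondary is the secondary's own fractional Doppler amplitude.
    The primary moves with line-of-sight velocity -q * v2, so its amplitude
    is -q * amp_secondary. Any remaining light (the circumbinary disk) is
    taken to be constant and therefore only dilutes the signal.
    """
    amp_primary = -q * amp_secondary
    return f_secondary * amp_secondary + f_primary * amp_primary


# Illustrative fractions: secondary 85%, primary 1%, circumbinary disk 14%.
diluted = total_modulation(amp_secondary=0.15, q=0.05,
                           f_primary=0.01, f_secondary=0.85)
print(f"total fractional modulation: {diluted:.4f}")
```

Because the primary is both faint and slow, its antiphase contribution is negligible in this example; the constant disk light is what mainly reduces the observed amplitude.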
Model fitting to the PG 1302-102 optical light curve. We fitted models to 
the observed light curve of PG 1302-102 by maximizing the Bayesian 
likelihood $\mathcal{L} \propto [\det|{\rm Cov}^{\rm D}\,{\rm Cov}^{\rm ph}|]^{-1/2}\exp(-\chi^2/2)$, where $\chi^2 \equiv Y^{\rm T}({\rm Cov})^{-1}Y$ and 
$Y \equiv O - M$ is the difference vector between the mean flux predicted in a model 
and the observed flux at each observation time $t_i$. Here Cov is the covariance 
matrix of flux uncertainties, allowing for correlations between fluxes measured 
at different $t_i$. We include two types of uncertainties: (1) random (uncorrelated) 
measurement errors 

$$ {\rm Cov}^{\rm ph}_{ij} = \sigma_i^2\,\delta_{ij} $$

where $\sigma_i^2$ is the variance in the photometric measurement for the ith data point (as 
reported in ref. 6); and (2) correlated noise due to intrinsic quasar variability, with 
covariance between the ith and jth data points 

$$ {\rm Cov}^{\rm D}_{ij} = \sigma_{\rm D}^2\exp\left[-\frac{|t_i - t_j|}{(1+z)\,\tau_{\rm D}}\right] $$

The parameters $\sigma_{\rm D}$ and $\tau_{\rm D}$ determine the amplitude and rest-frame coherence 
time, respectively, of correlated noise described by the damped random walk 
model, and the factor of $(1 + z)$ converts $\tau_{\rm D}$ to the observer’s frame where the 
$t_i$ are measured. The normalization of the Bayesian likelihood depends on these 
parameters, and therefore the normalization must be included when maximizing 
the likelihood over these parameters**. The covariance matrix for the total noise is 
given by ${\rm Cov} = {\rm Cov}^{\rm D} + {\rm Cov}^{\rm ph}$. We assume both types of noise are Gaussian, 
which provides a good description of observed quasar variability”. 
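
The total covariance matrix described above can be assembled in a few lines. The sketch below is illustrative, not the authors' implementation: the function name, the epochs and the noise values are made up, and the redshift z = 0.278 is an assumed value for PG 1302-102 that is not stated in this section.

```python
import math


def total_covariance(times, phot_sigmas, sigma_d, tau_d, z):
    """Sum of photometric (diagonal) and damped-random-walk covariance.

    Cov_ij = sigma_i^2 * delta_ij
             + sigma_d^2 * exp(-|t_i - t_j| / ((1 + z) * tau_d))
    """
    n = len(times)
    cov = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            lag = abs(times[i] - times[j])
            cov[i][j] = sigma_d ** 2 * math.exp(-lag / ((1.0 + z) * tau_d))
        cov[i][i] += phot_sigmas[i] ** 2  # uncorrelated measurement error
    return cov


# Illustrative inputs only: three epochs (days) and per-point errors (mag).
times = [0.0, 10.0, 30.0]
phot_sigmas = [0.05, 0.06, 0.07]
cov = total_covariance(times, phot_sigmas, sigma_d=0.05, tau_d=40.0, z=0.278)
for row in cov:
    print(["%.5f" % value for value in row])
```

Points separated by much less than the observed-frame coherence time are strongly correlated, which is why the fit cannot treat the photometric errors as independent.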

We then fit the following four different types of models to the data. 
(1) A relativistic beaming model with 5 + 2 = 7 model parameters: eccentricity $e$, 
argument of pericentre $\omega$, amplitude $K$, phase $t_0$, and orbital period $P$, as well as the 
two noise parameters $\sigma_{\rm D}$ and $\tau_{\rm D}$. 
(2) An accretion rate model with 3 + 2 = 5 model parameters: amplitude $K$, phase 
$t_0$, and period $P$, as well as the two noise parameters $\sigma_{\rm D}$ and $\tau_{\rm D}$. This model 
assumes that the light curve of PG 1302-102 tracks the mass-accretion rates that 
are predicted in hydrodynamic simulations. For near-equal-mass binaries, several 




studies have found that the mass-accretion rates fluctuate periodically, but 
resemble a series of sharp bursts, unlike the smoother, sinusoid-like shape of 
the light curve of PG 1302-102. To our knowledge, only three studies so far have 
simulated unequal-mass (q = 0.1) SMBH binaries'’*>. The accretion rates for 
these binaries are less ‘bursty’; among all of the cases in these three studies, the 
q = 0.075 and q = 0.1 binaries in ref. 11 resemble the light curve of PG 1302-102 
most closely (see Extended Data Fig. 3). Here we adopt the published accretion 
curve for q = 0.1, and perform a fit to PG 1302-102 by allowing an arbitrary linear 
scaling in time and amplitude, as well as a shift in phase; this gives us the three free 
parameters for this model. (We find that the q = 0.075 case provides a worse fit.) 
(3) A sinusoid model with 3 + 2 = 5 parameters: amplitude $K$, phase $t_0$, and period 
$P$, as well as the two noise parameters $\sigma_{\rm D}$ and $\tau_{\rm D}$. This model is equivalent, to first 
order in $v/c$, to the beaming model restricted to a circular binary orbit. 

(4) A constant luminosity model with 2 parameters, the noise parameters $\sigma_{\rm D}$ and 
$\tau_{\rm D}$. This model is for reference only, to quantify how poor the fit is with only these 
parameters. 

In each of these models, we fixed the mean flux to correspond to the mean 
magnitude M that is inferred from the optical data; allowing the mean to be an 
additional free parameter did not change our results. The highest maximum like- 
lihood is found for the beaming model, with best-fit values of P= 1,996 +32 days, 
K=0.065 70007, e= 0.094007, cos(w)= —0.65*)2., and ty) =7181 4? days, 
where the reference point t₀ is measured from MJD − 49,100. Uncertainties are
computed with the ‘emcee’ code⁴⁰, which implements a Markov chain Monte Carlo
algorithm, and which we use to sample the seven-dimensional posterior probability 
of the model given the data in ref. 6. We use 28 individual chains to sample the 
posterior for 1,024 steps each. Throwing away the first 600 steps (‘burning in’), we 
run for 424 steps and for each parameter we quote best-fit values corresponding to 
the maximum posterior probability, with errors given by the 85th and 15th per- 
centile values (marginalized over the other six parameters). The best-fit noise 
parameters are (op, Tp) =(0.0491)-0\° mag, 37.6 *}% days). The best-fit model 
has a reduced χ²/(N − 1 − 7) ≈ 2.1, where N = 245 is the number of data points.
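
The burn-in and percentile bookkeeping described above can be sketched as follows. This is only an illustration under stated assumptions: the chain counts (28 chains, 1,024 steps, 600 discarded) come from the text, but the function name `summarize_chains` is hypothetical and the samples are placeholder Gaussian draws, not output from the emcee sampler used in the analysis.

```python
import random

N_CHAINS, N_STEPS, N_BURN = 28, 1024, 600  # values quoted in the text


def summarize_chains(chains, n_burn=N_BURN, lo=15.0, hi=85.0):
    """Discard the first n_burn steps of every chain, pool the remainder,
    and return (number of kept samples, lo-th percentile, hi-th percentile)."""
    kept = sorted(x for chain in chains for x in chain[n_burn:])

    def percentile(p):
        # Nearest-rank percentile on the pooled, sorted samples.
        idx = min(len(kept) - 1, max(0, round(p / 100.0 * (len(kept) - 1))))
        return kept[idx]

    return len(kept), percentile(lo), percentile(hi)


# Placeholder for sampler output: standard-normal draws for one parameter
# (an assumption made purely so that the sketch runs on its own).
rng = random.Random(0)
fake_chains = [[rng.gauss(0.0, 1.0) for _ in range(N_STEPS)]
               for _ in range(N_CHAINS)]
n_kept, p15, p85 = summarize_chains(fake_chains)
print(n_kept, p15, p85)
```

With these settings each chain contributes 424 post-burn-in steps, so the pooled sample holds 28 × 424 values per parameter. The sketch only reproduces the percentile bounds; the central values quoted in the text are maximum-posterior values, which require the posterior probability of each sample as well.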

To assess which of the models is favoured by the data, we use the Bayesian
information criterion (BIC), a standard method for comparing different models
that penalizes models with a larger number of free parameters⁴¹. Specifically,
BIC = −2 ln(ℒ) + k ln(N), where the first term is evaluated using the best-fit
parameters in each of the models and k is the number of model parameters. We
find the following differences ΔBIC between pairs of models:

BIC_acc − BIC_beam = 4.0 (the beaming model is preferred over the accretion
model);

BIC_acc − BIC_sin = 14.9 (the sinusoid model is strongly preferred over the
accretion model);

BIC_sin − BIC_beam = −10.9 (the sinusoid model is strongly preferred over the
beaming model);

BIC_const − BIC_beam = 11.5 (the beaming model is strongly preferred over pure
noise); and

BIC_const − BIC_sin = 22.4 (the sinusoid model is strongly preferred over pure
noise).

We conclude that a sinusoid, or equivalently the beaming model restricted to a
circular binary, is the preferred model. This model is very strongly favoured over
the best-fit accretion model (see Extended Data Fig. 3), with ΔBIC > 14.9. For the
assumed Gaussian distributions, this corresponds to an approximate likelihood
ratio of exp(−14.9/2) ≈ 5.7 × 10⁻⁴. Although our best-fit beaming model has a
small non-zero eccentricity, the seven-parameter eccentric model is disfavoured
(by ΔBIC = 11.5) relative to the five-parameter circular case.
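
The model comparison above can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function names are hypothetical, the ΔBIC values are the rounded numbers quoted in the text, and no likelihoods from the actual fits are used.

```python
import math


def bic(log_likelihood, k, n):
    """Bayesian information criterion: BIC = -2 ln(L) + k ln(N)."""
    return -2.0 * log_likelihood + k * math.log(n)


def likelihood_ratio(delta_bic):
    """Approximate likelihood ratio exp(-dBIC / 2) for a BIC difference."""
    return math.exp(-delta_bic / 2.0)


# BIC differences quoted in the text (first model minus second model).
d_acc_beam, d_acc_sin, d_const_beam, d_const_sin = 4.0, 14.9, 11.5, 22.4

# BIC_sin - BIC_beam can be obtained by two independent routes.
d_sin_beam_via_acc = d_acc_beam - d_acc_sin
d_sin_beam_via_const = d_const_beam - d_const_sin

# Extra penalty paid by a 7-parameter model over a 5-parameter one, N = 245.
penalty = bic(0.0, 7, 245) - bic(0.0, 5, 245)

print(round(d_sin_beam_via_acc, 1), round(d_sin_beam_via_const, 1))
print(round(penalty, 2), likelihood_ratio(d_acc_sin))
```

Both routes give the same sinusoid-minus-beaming difference, so the quoted pairwise values are mutually consistent. The penalty term alone is about 11 for N = 245, which suggests that the two extra parameters of the eccentric model buy almost no improvement in likelihood. The likelihood ratio computed from the rounded value 14.9 comes out close to the quoted 5.7 × 10⁻⁴; the small difference is presumably a rounding effect.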

We conservatively allowed the amplitude of accretion-rate fluctuations to be a
free parameter in the accretion models, but we note that the accretion-rate
variability measured in hydrodynamic simulations exhibits large (order one)
deviations from the mean, even for 0.05 < q < 0.1 binaries'''*"’. In the accretion-rate
model, an additional physical mechanism needs to be invoked to damp the
fluctuations to the smaller, approximately 14% amplitude seen in PG 1302-102 (such
as a more substantial contribution from the CBD and/or the primary).
Disk precession. The lowest BIC model, with a steady accretion rate and a relativistic
boost from a circular orbit, has a reduced χ² = 2.1, indicating that the
relativistic-boost model with intrinsic noise does not fully describe the observed 
light curve. The residuals could be explained by a lower-amplitude periodic modu- 
lation in the mass-accretion rate, which is expected to have a non-sinusoidal shape 
(with sharper peaks and broader troughs, as mentioned above”’). Alternatively, the 
minidisks, which we implicitly assumed to be co-planar with the binary orbit, 
could instead have a substantial tilt’”. 

A circumsecondary minidisk that is tilted with respect to the orbital plane of the 
binary will precess around the binary angular-momentum vector, causing addi- 
tional photometric variations due to the changing projected area of the disk on the 
sky. The precession timescale can be estimated from the total angular momentum
of the secondary disk and the torque exerted on it by the primary black hole. The
ratio of the precession period to the orbital period of the binary is” 


$$\frac{P_{\rm prec}}{P_{\rm orb}} = -\frac{8}{\sqrt{3}}\,\frac{\sqrt{1+q}}{\cos\theta}$$


where θ ∈ (−π/2, π/2) is the angle between the angular-momentum vectors of the
disk and the binary, and we have chosen the outer edge of the minidisk to coincide
with the Hill sphere of the secondary black hole, R_H = (q/3)^(1/3) a, for binary
semi-major axis a. This choice gives the largest secondary disk and the shortest
precession periods. For small binary mass ratios, consistent with the relativistic
beaming scenario, the precession period can be as short as 4.8 P_orb, which causes
variations on a timescale that spans the current observations of PG 1302-102. The
precession timescale would be longer (> 20 P_orb) for a smaller secondary disk that
is tidally truncated at 0.27 q^0.3 a (ref. 43), and with a more inclined (45°) disk.
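
As a rough numerical check of these timescales, the sketch below evaluates the magnitude of the precession-to-orbital period ratio. The Hill-radius case follows the expression in the text. The extension to a smaller outer radius assumes that the precession period scales as the outer disk radius to the power −3/2, as expected for a leading-order tidal torque; that scaling, the function names and the example values of q are assumptions made for illustration.

```python
import math


def hill_radius(q):
    """Hill radius of the secondary in units of the semi-major axis a."""
    return (q / 3.0) ** (1.0 / 3.0)


def precession_ratio(q, theta, r_out=None):
    """|P_prec / P_orb| for a minidisk tilted by theta (radians).

    With r_out=None the disk extends to the Hill radius, giving
    8 sqrt(1 + q) / (sqrt(3) cos(theta)). Other outer radii (in units of a)
    use the assumed r_out**-1.5 scaling of the precession period.
    """
    if r_out is None:
        r_out = hill_radius(q)
    at_hill = 8.0 * math.sqrt(1.0 + q) / (math.sqrt(3.0) * abs(math.cos(theta)))
    return at_hill * (hill_radius(q) / r_out) ** 1.5


q = 0.05  # illustrative small mass ratio
coplanar_hill = precession_ratio(q, 0.0)
tilted_truncated = precession_ratio(q, math.radians(45.0), r_out=0.27 * q ** 0.3)
print(coplanar_hill, tilted_truncated)
```

For this example the nearly coplanar, Hill-sized disk precesses over a few orbital periods, whereas the tidally truncated disk tilted by 45° gives a ratio above 20, in line with the two regimes described in the text.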

Archival UV data. FUV (0.14-0.175 μm) and NUV (0.19-0.27 μm) spectra of
PG 1302-102 have been obtained by the HST and the GALEX since 1992. HST FOS
NUV spectra were obtained on 17 July 1992 (pre-COSTAR)⁴⁴. HST STIS FUV
spectra were obtained on 21 August 2001 (ref. 45). GALEX FUV and NUV spectra
were obtained on 8 March 2008 and 6 April 2009, and HST COS FUV spectra were
obtained on 28 January 2011. All data are publicly available through the Mikulski
Archive for Space Telescopes at http://archive.stsci.edu. All measurements were
spectrophotometrically calibrated, and binned or smoothed to a resolution of
1-3 Å. The spectra (Fig. 2) have errors per bin that are typically less than 2%;
published absolute photometric accuracies are better than 5%.

From each spectrum, average flux measurements (Fig. 2) were obtained in one
or both of two discrete bands: FUV continuum (0.145-0.1525 μm; a range chosen
to avoid the Lyα line) and NUV continuum (0.20-0.26 μm). For the GALEX NUV
photometric data (also used in Fig. 2) we adopted a small correction (0.005 mag)
for the transformation from the GALEX NUV to our NUV continuum band.
GALEX FUV photometric data were not used because of the substantial
contribution from redshifted Lyα. The broad lines in the UV spectra (in Fig. 3) do not
show a large Doppler shift (Δλ = (v/c)λ ≈ 140 Å). This is unsurprising, because
the broad line widths (2,500-4,500 km s⁻¹) are much smaller than the inferred
relativistic line-of-sight velocities, and are expected to be produced by gas at larger
radii, unrelated to the rapidly orbiting minidisks that produce the featureless
thermal continuum emission.


27. Richards, G. T. et al. Spectral energy distributions and multiwavelength selection of
type 1 quasars. Astrophys. J. 166 (Suppl.), 470-497 (2006).

28. Yu, Q. & Tremaine, S. Observational constraints on growth of massive black holes.
Mon. Not. R. Astron. Soc. 335, 965-976 (2002).

29. Shakura, N. I. & Sunyaev, R. A. Black holes in binary systems. Observational
appearance. Astron. Astrophys. 24, 337-355 (1973).

30. Jiang, Y.-F., Stone, J. M. & Davis, S. W. A global three-dimensional radiation
magneto-hydrodynamic simulation of super-Eddington accretion disks.
Astrophys. J. 796, 106 (2014).

31. Narayan, R. & McClintock, J. E. Advection-dominated accretion and the black hole
event horizon. New Astron. Rev. 51, 733-751 (2008).

32. Mahadevan, R. Scaling laws for advection-dominated flows: applications to
low-luminosity galactic nuclei. Astrophys. J. 477, 585-601 (1997).

33. Narayan, R., Mahadevan, R. & Quataert, E. in Theory of Black Hole Accretion Disks
(eds Abramowicz, M. A. et al.) 148-182 (Cambridge Univ. Press, 1998).

34. Hutchings, J. B., Morris, S. C., Gower, A. C. & Lister, M. L. Correlated optical and radio
structure in the QSO 1302-102. Publ. Astron. Soc. Pac. 106, 642-645 (1994).

35. Wang, J.-M., Ho, L. C. & Staubert, R. The central engines of radio-loud quasars.
Astron. Astrophys. 409, 887-898 (2003).

36. Tanaka, T. & Menou, K. Time-dependent models for the afterglows of massive
black hole mergers. Astrophys. J. 714, 404-422 (2010).

37. Artymowicz, P. & Lubow, S. H. Dynamics of binary-disk interaction. 1. Resonances
and disk gap sizes. Astrophys. J. 421, 651-667 (1994).

38. Kozłowski, S. et al. Quantifying quasar variability as part of a general approach to
classifying continuously varying sources. Astrophys. J. 708, 927-945 (2010).

39. Andrae, R., Kim, D.-W. & Bailer-Jones, C. A. L. Assessment of stochastic and
deterministic models of 6304 quasar lightcurves from SDSS Stripe 82. Astron.
Astrophys. 554, A137 (2013).

40. Foreman-Mackey, D., Hogg, D. W., Lang, D. & Goodman, J. emcee: the MCMC
hammer. Publ. Astron. Soc. Pac. 125, 306-312 (2013).

41. Kass, R. E. & Raftery, A. E. Bayes factors. J. Am. Stat. Assoc. 90, 773-795 (1995).

42. Lai, D. Star-disc-binary interactions in protoplanetary disc systems and primordial
spin-orbit misalignments. Mon. Not. R. Astron. Soc. 440, 3532-3544 (2014).

43. Roedig, C., Krolik, J. H. & Miller, M. C. Observational signatures of binary
supermassive black holes. Astrophys. J. 785, 115 (2014).

44. Evans, I. N. & Koratkar, A. P. A complete atlas of recalibrated Hubble Space
Telescope Faint Object Spectrograph spectra of active galactic nuclei and quasars.
I. Pre-COSTAR spectra. Astrophys. J. 150 (Suppl.), 73-164 (2004).

45. Cooksey, K. L., Prochaska, J. X., Chen, H.-W., Mulchaey, J. S. & Weiner, B. J.
Characterizing the low-redshift intergalactic medium toward PKS 1302-102.
Astrophys. J. 676, 262-285 (2008).

Extended Data Figure 1 | Model spectrum of PG 1302-102. Circumbinary
(dashed blue) and circumsecondary (solid black) disk spectra for a total binary
mass of 10^9.4 M☉, binary mass ratio of q = 0.05, and ratio of accretion rates
Ṁ₂/Ṁ₁ = 20. A vertical dashed line marks the centre of the V band and the
approximate flux from an advection-dominated accretion flow (ADAF) is
shown as a red dot for the V-band contribution of the primary. The spectrum
for a radiatively efficient, thin disk around the primary is shown by the thin red
dashed curve for reference.


Extended Data Figure 2 | Parameter combinations for which the combined
V-band luminosity of the three-component system varies by the required
0.14 mag. M is the binary mass, q is the mass ratio, and i is the orbital
inclination angle. This figure is analogous to Fig. 1, except instead of adopting
an ad-hoc fractional luminosity contribution f₂ by the secondary, the
luminosities of each of the three components are computed from a model: the
luminosity of the primary is assumed to arise from an ADAF, whereas the
luminosity of the secondary is generated by a modestly super-Eddington thin
disk. Emission from the circumbinary disk is also from a thin disk, and is
negligible except for binaries with the lowest mass ratio q < 0.01 (see text).


Extended Data Figure 3 | Model fits to the optical light curve of PG 1302-102.
Best-fit curves assuming relativistic boost from a circular binary (solid black
curves), a pure sinusoid (red dotted curves), and accretion rate variability
adopted from hydrodynamic simulations¹¹ (blue dashed curves) for a q = 0.075
(left) and a q = 0.1 (right) binary. The grey points with 1σ error bars are the
data for PG 1302-102 (ref. 6). The horizontal axis of both panels is MJD − 49,100.




doi:10.1038/nature14889 


Spawning rings of exceptional points out of Dirac cones

Bo Zhen¹*, Chia Wei Hsu¹,²*, Yuichi Igarashi¹,³*, Ling Lu¹, Ido Kaminer¹, Adi Pick¹,⁴, Song-Liang Chua⁵,
John D. Joannopoulos¹ & Marin Soljačić¹


The Dirac cone underlies many unique electronic properties of 
graphene’ and topological insulators, and its band structure— 
two conical bands touching at a single point—has also been rea- 
lized for photons in waveguide arrays’, atoms in optical lattices’, 
and through accidental degeneracy**. Deformation of the Dirac 
cone often reveals intriguing properties; an example is the 
quantum Hall effect, where a constant magnetic field breaks the 
Dirac cone into isolated Landau levels. A seemingly unrelated phe- 
nomenon is the exceptional point®’, also known as the parity-time 
symmetry breaking point*”™', where two resonances coincide in 
both their positions and widths. Exceptional points lead to 
counter-intuitive phenomena such as loss-induced transparency”, 
unidirectional transmission or reflection’!*™, and lasers with 
reversed pump dependence”’ or single-mode operation’®’’. Dirac 
cones and exceptional points are connected: it was theoretically 
suggested that certain non-Hermitian perturbations can deform 
a Dirac cone and spawn a ring of exceptional points’*”°. Here we 
experimentally demonstrate such an ‘exceptional ring’ in a photo- 
nic crystal slab. Angle-resolved reflection measurements of the 
photonic crystal slab reveal that the peaks of reflectivity follow 
the conical band structure of a Dirac cone resulting from accidental 
degeneracy, whereas the complex eigenvalues of the system are 
deformed into a two-dimensional flat band enclosed by an excep- 
tional ring. This deformation arises from the dissimilar radiation 
rates of dipole and quadrupole resonances, which play a role ana- 
logous to the loss and gain in parity-time symmetric systems. Our 
results indicate that the radiation existing in any open system can 
fundamentally alter its physical properties in ways previously 
expected only in the presence of material loss and gain. 

Closed and lossless physical systems are described by Hermitian 
operators, which guarantee realness of the eigenvalues and a complete 
set of eigenfunctions that are orthogonal to each other. On the other 
hand, systems with open boundaries’ or with material loss and 
gain”'””° are non-Hermitian®, and have non-orthogonal eigenfunc- 
tions with complex eigenvalues where the imaginary part corresponds 
to decay or growth. The most drastic difference between Hermitian 
and non-Hermitian systems is that the latter exhibit exceptional points 
(EPs) where both the real and the imaginary parts of the eigenvalues 
coalesce. At an EP, two (or more) eigenfunctions collapse into one so 
the eigenspace no longer forms a complete basis, and this eigenfunc- 
tion becomes orthogonal to itself under the unconjugated ‘inner prod- 
uct’. To date, most studies of the EP and its intriguing consequences 
concern parity-time symmetric systems that rely on material loss and 
gain”'””’, but EPs are a general property that require only non- 
Hermiticity. Here, we show the existence of EPs in a photonic crystal 
slab with negligible absorption loss and no artificial gain. When a 
Dirac-cone system has dissimilar radiation rates, the band structure 
is altered abruptly to show branching features with a ring of EPs. We
provide a complete picture of this system, ranging from an analytic
model and numerical simulations to experimental observations; taken 
together, these results illustrate the role of radiation-induced non- 
Hermiticity that bridges the study of EPs and the study of Dirac cones. 

We start by showing that non-Hermiticity from radiation can 
deform an accidental Dirac point into a ring of EPs. First, consider a 
two-dimensional photonic crystal (Fig. 1a inset), where a square lattice 
(periodicity a) of circular air holes (radius r) is introduced in a dielec- 
tric material. This is a Hermitian system, as there is no material gain or 
loss and no open boundary for radiation. By tuning a system parameter 
(for example, r), one can achieve accidental degeneracy between a 
quadrupole mode and two degenerate dipole modes at the Γ point
(centre of the Brillouin zone), leading to a linear Dirac dispersion due 
to the anti-crossing between two bands with the same symmetry*”. 
The accidental Dirac dispersion from the effective Hamiltonian model 
(see equation (1) below with γ_d = 0) is shown as solid lines in Fig. 1a,
agreeing with numerical simulation results (symbols). In the effective 
Hamiltonian we do not consider the dispersionless third band (grey 
line) owing to symmetry arguments (Supplementary Information sec- 
tion I), although this third band cannot be neglected in certain calcula- 
tions, including the Berry phase and effective medium properties”. 

Next, we consider a similar, but open, system: a photonic crystal slab 
(Fig. 1b inset) with finite thickness h. With the open boundary, modes 
within the radiation continuum become resonances because they radi- 
ate by coupling to extended plane waves in the surrounding medium. 
Non-Hermitian perturbations need to be included in the Hamiltonian 
to account for the radiation loss. To the leading order, radiation of the 
dipole mode can be described by adding an imaginary part −iγ_d to the
Hamiltonian, while the quadrupole mode does not radiate owing to its 
symmetry mismatch with the plane waves™. Specifically, at the Γ point
the system has C₂ rotational symmetry (invariant under 180° rotation
around the z axis), and the quadrupole mode does not couple to the
radiating plane wave because the former has a field profile E(r) that is
even under C₂ rotation, E(r) = Ô_{C₂}E(r), whereas the latter is odd,
E(r) = −Ô_{C₂}E(r). The effective Hamiltonian is


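$$H_{\rm eff} = \begin{pmatrix} \omega_0 & v_g|\mathbf{k}| \\ v_g|\mathbf{k}| & \omega_0 - i\gamma_d \end{pmatrix} \qquad (1)$$

with complex eigenvalues

$$\omega_\pm = \omega_0 - i\frac{\gamma_d}{2} \pm v_g\sqrt{|\mathbf{k}|^2 - k_c^2} \qquad (2)$$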


where ω₀ is the frequency at accidental degeneracy, v_g is the group
velocity of the linear Dirac dispersion in the absence of radiation, |k| is
the magnitude of the in-plane wavevector (k_x, k_y), and k_c ≡ γ_d/2v_g.
Here, one of the three bands is decoupled from the other two and
is not included in equation (1) (see Supplementary Information
section II). In equation (2), a ring defined by |k| = k_c separates the k space


¹Research Laboratory of Electronics, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA. ²Department of Applied Physics, Yale University, New Haven, Connecticut 06520, USA.
³Smart Energy Research Laboratories, NEC Corporation, 34 Miyukigaoka, Tsukuba, Ibaraki 305-8501, Japan. ⁴Department of Physics, Harvard University, Cambridge, Massachusetts 02138, USA.
⁵DSO National Laboratories, 20 Science Park Drive, Singapore 118230, Singapore.
*These authors contributed equally to this work.


354 | NATURE | VOL 525 | 17 SEPTEMBER 2015 


Figure 1 | Accidental degeneracy in Hermitian and non-Hermitian
photonic crystals. a, Band structure of a two-dimensional photonic crystal
consisting of a square lattice of circular air holes. Tuning the radius r leads to
accidental degeneracy between a quadrupole band and two doubly degenerate
dipole bands, resulting in two bands with linear Dirac dispersion (red and blue)
and a flat band (grey). b, c, The real (b) and imaginary (c) parts of the
eigenvalues of an open, and therefore non-Hermitian, system: a
photonic crystal slab with finite thickness, h. By tuning the radius, accidental
degeneracy in the real part can be achieved, but the Dirac dispersion is
deformed owing to the non-Hermiticity. The analytic model predicts that the
real (imaginary) part of the eigenvalue stays constant inside (outside) a ring
in the wavevector space, indicating two flat bands in dispersion, with a ring of
exceptional points (EPs) where both the real and the imaginary parts are
degenerate. The orange shaded regions correspond to the inside of the ring. In
the upper panels of a-c, solid lines are predictions from the analytic model and
symbols are from numerical simulations: red squares represent the band
connecting to the quadrupole mode at the centre; blue circles represent the
band connecting to the dipole mode at the centre; and grey crosses represent the
third band that is decoupled from the previous two due to symmetry. The
three-dimensional plots in the lower panels are from simulations.

into two regions: inside the ring (|k| < k_c), Re(ω±) are dispersionless
and degenerate; outside the ring (|k| > k_c), Im(ω±) are dispersionless
and degenerate. In the vicinity of k_c, Im(ω±) and Re(ω±) exhibit
square-root dispersion (also known as branching behaviour‘) inside
and outside the ring, respectively. Exactly on the ring (|k| = k_c), the
two eigenvalues ω± are degenerate in both real and imaginary parts;
meanwhile, the matrix H_eff becomes defective with an incomplete
eigenspace spanned by only one eigenvector (1, −i)^T that is orthogonal
to itself under the unconjugated ‘inner product’, given by a^T b for
vectors a and b. This self-orthogonality is the definition of EPs; hence,
here we have not just one EP, but a continuous ring of EPs. We call it an
exceptional ring.

Figure 1b, c shows the complex eigenvalues of the photonic crystal
slab structure calculated numerically (symbols), which closely follow
the analytic model of equation (2) shown as solid lines in the figure. In
Supplementary Fig. 1, we show that the two eigenvectors indeed
coalesce into one at the EP, which is impossible in Hermitian systems
(also see Supplementary Information section III). When the radius r of
the holes is tuned away from accidental degeneracy, the exceptional
ring and the associated branching behaviour disappear, as shown in
Supplementary Fig. 2. Several properties of the photonic crystal slab
contribute to the existence of this exceptional ring. Owing to
periodicity, one can probe the dispersion from two degrees of freedom,
k_x and k_y, in just one structure. The open boundary provides radiation
loss, and the C₂ rotational symmetry differentiates the radiation loss of
the dipole mode and of the quadrupole mode.

We can rigorously show that the exceptional ring exists in realistic
photonic crystal slabs, not just in the effective Hamiltonian model. Our
proof is based on the unique topological property of EPs: when the
system parameters evolve adiabatically along a loop encircling an EP,
the two eigenvalues switch their positions when the system returns to
its initial parameters’*'”», in contrast to the typical case where the two
eigenvalues return to themselves. Using this property, we numerically
show, in Supplementary Fig. 3 and Supplementary Information
section IV, that the complex eigenvalues always switch their positions
along every direction in the k space, and therefore prove the existence
of this exceptional ring. As opposed to the simplified effective
Hamiltonian model, in a real photonic crystal slab, the EP may exist at a
slightly different magnitude of k and for a slightly different hole radius
r along different directions in the k space, but this variation is small and
negligible in practice (Supplementary Information section V).
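
The behaviour of the effective two-band model, including the switching of eigenvalues around an EP, can be illustrated numerically. The sketch below is not the calculation of the Supplementary Information: the parameter values are arbitrary, and the detuning `delta` is a hypothetical second parameter standing in for a change of hole radius away from accidental degeneracy. With `delta = 0` the matrix reduces to equation (1).

```python
import cmath
import math


def eigenvalues(k, delta=0.0, omega0=0.0, v_g=1.0, gamma_d=0.2):
    """Eigenvalues of [[omega0 + delta, v_g*k], [v_g*k, omega0 - delta - 1j*gamma_d]].

    delta = 0 reproduces equation (1); delta is a hypothetical detuning.
    """
    mean = omega0 - 0.5j * gamma_d
    root = cmath.sqrt((v_g * k) ** 2 + (delta + 0.5j * gamma_d) ** 2)
    return mean + root, mean - root


def track_loop(k_centre, rho=0.5, steps=400, v_g=1.0, gamma_d=0.2):
    """Follow both eigenvalues continuously around a closed loop in the
    (k, delta) plane; return the labelled pair at the start and at the end."""
    start = prev = eigenvalues(k_centre * (1.0 + rho), v_g=v_g, gamma_d=gamma_d)
    for n in range(1, steps + 1):
        phi = 2.0 * math.pi * n / steps
        k = k_centre * (1.0 + rho * math.cos(phi))
        delta = 0.5 * rho * gamma_d * math.sin(phi)
        a, b = eigenvalues(k, delta, v_g=v_g, gamma_d=gamma_d)
        # Relabel if swapping gives the smaller jump from the previous step.
        if abs(a - prev[0]) + abs(b - prev[1]) > abs(a - prev[1]) + abs(b - prev[0]):
            a, b = b, a
        prev = (a, b)
    return start, prev


k_c = 0.2 / (2.0 * 1.0)              # gamma_d / (2 v_g)
inside = eigenvalues(0.5 * k_c)      # real parts coincide
outside = eigenvalues(2.0 * k_c)     # imaginary parts coincide
on_ring = eigenvalues(k_c)           # both parts coincide: an EP

# Self-orthogonality of (1, -i) under the unconjugated product a^T b.
self_product = 1 * 1 + (-1j) * (-1j)

start_ep, end_ep = track_loop(k_c)          # loop encircles the EP
start_far, end_far = track_loop(3.0 * k_c)  # loop does not encircle it
print(abs(end_ep[0] - start_ep[1]), abs(end_far[0] - start_far[0]))
```

In this toy version, a loop centred on |k| = k_c returns with the two eigenvalue labels exchanged, whereas a loop centred well outside the ring returns each eigenvalue to itself. Tracking by nearest neighbour is a simple choice that works here because the step along the loop is small compared with the eigenvalue splitting.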

To demonstrate the existence of the exceptional ring in such a
system, we fabricate large-area periodic patterns in a Si₃N₄ slab
(n = 2.02 in the visible spectrum, thickness 180 nm) on top of 6 μm
of silica (n = 1.46) using interference photolithography”. Scanning
electron microscope (SEM) images of the sample are shown in
Fig. 2a, featuring a square lattice (periodicity a = 336 nm) of
cylindrical air holes with radius 109 nm. We immerse the structure in an
optical liquid with a specified refractive index that can be tuned;
accidental degeneracy in the Hermitian part is achieved when the liquid
index is selected to be n = 1.48. We perform angle-resolved reflectivity
measurements (set-up shown in Fig. 2b) between 0° and 2° along the
Γ → X direction and the Γ → M direction, for both s and p
polarizations. Details of the sample fabrication and the experimental set-up can
be found in Supplementary Information section VI. The measured
reflectivity for the relevant polarization is plotted in the upper panel
of Fig. 2c, showing good agreement with numerical simulation results
(lower panel), with differences coming from scattering due to surface
roughness, inhomogeneous broadening, and the uncertainty in the
measurements of system parameters. The complete experimental
result for both polarizations is shown in Supplementary Fig. 4; the
third and dispersionless band shows up in the other polarization,
decoupled from the two bands of interest.
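
For orientation, the measured quantities can be converted to the normalized units used for the band structure. The sketch below is a hedged illustration: the lattice period and liquid index are taken from the text, but the wavelength of 564 nm is only an example value, the function names are hypothetical, and the code assumes that the quoted incidence angles are measured inside the liquid.

```python
import math

A_NM = 336.0      # lattice periodicity a quoted in the text
N_LIQUID = 1.48   # liquid index at accidental degeneracy


def normalized_frequency(wavelength_nm, a_nm=A_NM):
    """omega a / (2 pi c) = a / lambda for a vacuum wavelength lambda."""
    return a_nm / wavelength_nm


def normalized_wavevector(theta_deg, wavelength_nm, n_medium=N_LIQUID, a_nm=A_NM):
    """|k| a / (2 pi) for an incidence angle theta measured inside a medium
    of index n_medium (assumption: the angles refer to the liquid)."""
    return n_medium * a_nm / wavelength_nm * math.sin(math.radians(theta_deg))


example_wavelength_nm = 564.0  # illustrative value only
print(normalized_frequency(example_wavelength_nm))
print(normalized_wavevector(2.0, example_wavelength_nm))
```

If the angles were instead defined in air, the factor `n_medium` would be replaced by 1, so the conversion should be checked against the set-up details before it is used quantitatively.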

The peaks of reflectivity (dark red colour in Fig. 2c) follow the linear 
Dirac dispersion; this feature disappears for structures with different 
radii that do not reach accidental degeneracy (experimental results in 




Figure 2 | Experimental reflectivity spectrum and accidental Dirac
dispersion. a, SEM images of the photonic crystal samples: side view (upper
panel) and top view (lower panel). b, Schematic drawing of the measurement
set-up. Linearly polarized light from a super-continuum source is reflected
off the photonic crystal slab (‘sample’) immersed in an optical liquid, and
collected by a spectrometer (SP). The incident angle θ is controlled using a
precision rotating stage. BS, beam splitter. c, Reflectivity spectrum of the sample
measured experimentally (upper panel) and calculated numerically (lower
panel) along the Γ → X and the Γ → M directions. The peak location
of reflectivity reveals the Hermitian part of the system, which forms Dirac
dispersion due to accidental degeneracy. In the lower panel, white solid lines
indicate the real part of the eigenvalues; spectra and eigenvalues at three
representative angles (marked by dashed lines and circles) are shown in
d. d, Three line cuts of reflectivity R from simulation results. Also shown are the
complex eigenvalues (open circles) calculated numerically. At large angles
(0.8°), the two resonances are far apart, so the reflectivity peaks (red arrows) are
close to the actual positions of the complex eigenvalues. However, at small
angles (0.3°, 0.1°), the coupling between resonances causes the resonance peaks
(red arrows) to have much greater separations in frequency compared to the
complex eigenvalues. The black arrows mark the dips in reflectivity that
correspond to the coupled-resonator-induced transparency (CRIT, see text
for details).

Supplementary Fig. 5). Note that the reflection peaks do not follow the
real part of the complex eigenvalues of the Hamiltonian; in fact they
follow the eigenvalues of the Hermitian part of the Hamiltonian, even
though the Hamiltonian is non-Hermitian. To understand this, we
consider a more general two-by-two Hamiltonian of a coupled
resonance system H and separate it into a Hermitian part A and an
anti-Hermitian part −iB (A and B are both Hermitian)

$$H = \underbrace{\begin{pmatrix} \omega_1 & \kappa \\ \kappa & \omega_2 \end{pmatrix}}_{A} - \underbrace{i\begin{pmatrix} \gamma_1 & \gamma_{12} \\ \gamma_{12} & \gamma_2 \end{pmatrix}}_{iB} \;\xrightarrow{\text{eigenvalues}}\; \begin{pmatrix} \omega_+ & 0 \\ 0 & \omega_- \end{pmatrix} \qquad (3)$$

As before, we use ω± to denote the complex eigenvalues of the
Hamiltonian A − iB. Physically, matrix A describes a lossless system,
and matrix −iB adds the effects of loss. In B, the diagonal elements are
loss rates (in our system, they come primarily from radiation), and the
off-diagonal elements arise from overlap of the two radiation patterns,
also known as external coupling of resonances via the continuum.
Modelling the reflectivity using temporal coupled-mode theory
(TCMT), we show that when matrix B is dominated by radiation,
the reflection peaks occur near the eigenvalues Ω₁,₂ of the Hermitian
part A and are independent of the anti-Hermitian part −iB (see
Supplementary Information section VII and Supplementary Fig. 6
for details). Therefore, the linear Dirac dispersion observed in the
measured data of Fig. 2c (dark red) indicates that we have successfully
achieved accidental degeneracy in the eigenvalues of the Hermitian
part, consistent with the simplified model in equation (1). In
Supplementary Fig. 8b, we plot the values of Ω₁,₂ extracted from the


reflectivity data through a more rigorous data analysis using TCMT 
(described below); the linear dispersion is indeed observed. We note 
that when there is substantial non-radiative loss or material gain in the 
system, the reflection peaks no longer follow the eigenvalues of the 
Hermitian part (see Supplementary Information section VIII and 
Supplementary Fig. 7). 
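
The distinction between the eigenvalues of the Hermitian part and the complex eigenvalues of the full Hamiltonian can be made concrete with the simplified model of equation (1). The sketch below is illustrative only: the numerical values are arbitrary, and the helper names `split` and `eig2` are hypothetical. It splits a two-by-two matrix H into A − iB and compares the two sets of eigenvalues exactly on the exceptional ring.

```python
import cmath


def eig2(m):
    """Eigenvalues of a 2x2 matrix from its mean diagonal and discriminant."""
    (m11, m12), (m21, m22) = m
    mean = (m11 + m22) / 2
    root = cmath.sqrt(((m11 - m22) / 2) ** 2 + m12 * m21)
    return mean + root, mean - root


def split(h):
    """Return (A, B) with H = A - iB, where A and B are both Hermitian."""
    (h11, h12), (h21, h22) = h
    dag = ((h11.conjugate(), h21.conjugate()),
           (h12.conjugate(), h22.conjugate()))
    a = tuple(tuple((h[r][c] + dag[r][c]) / 2 for c in range(2)) for r in range(2))
    b = tuple(tuple(1j * (h[r][c] - dag[r][c]) / 2 for c in range(2)) for r in range(2))
    return a, b


omega0, v_g, gamma_d = 1.0, 1.0, 0.2  # illustrative values, not fitted ones
k_c = gamma_d / (2.0 * v_g)           # radius of the exceptional ring
H = ((omega0, v_g * k_c), (v_g * k_c, omega0 - 1j * gamma_d))

A, B = split(H)
peaks = eig2(A)        # eigenvalues of the Hermitian part: omega0 +/- v_g k_c
resonances = eig2(H)   # complex eigenvalues: degenerate on the ring
print(peaks, resonances)
```

On the ring the complex eigenvalues coincide, while the eigenvalues of the Hermitian part remain split by 2 v_g k_c, which is the situation described for the reflection peaks. The sketch keeps B diagonal, as in equation (1); the off-diagonal radiative coupling of equation (3) is not included.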

The real part of the complex eigenvalues of the Hamiltonian,
Re(ω±), behaves very differently from the reflectivity peaks.
Simulation results (solid white lines in the lower panel of Fig. 2c) show
that Re(ω±) is dispersionless at small angles with a branch-point
singularity around 0.31°, consistent with the feature predicted by the
simplified Hamiltonian in equation (2). In Fig. 2d, we compare the
reflectivity spectra from simulations (with peaks indicated by red
arrows) with the corresponding complex eigenvalues at three
representative angles (0.8° in blue, 0.31° in green and 0.1° in magenta). At
0.31°, the two complex eigenvalues are degenerate, indicating an EP;
however, the two reflection peaks do not coincide since they represent
the eigenvalues of only the Hermitian part of the Hamiltonian, which
does not have degeneracy here. The dip in reflectivity between the two
peaks (marked as black arrows in Figs 2 and 3) is the
coupled-resonator-induced transparency (CRIT) that arises from the interference
between radiation of the two resonances”®, similar to
electromagnetically induced transparency.

Qualitatively, the peak locations of the measured reflectivity
spectrum reveal the eigenvalues of the Hermitian part, A, and the
linewidths of the peaks reveal the anti-Hermitian part, −iB; diagonalizing
A − iB yields the eigenvalues ω±, as illustrated in equation (3). To
be more quantitative, we use TCMT and account for both the direct




Figure 3 | Experimental demonstration of an exceptional ring. a, Examples
of reflection spectrum R from the sample at three different angles (0.8° blue,
0.3° green and 0.1° magenta, solid lines) measured with s-polarized light along
the Γ → X direction (same set-up as in numerical simulations shown in Fig. 2d),
fitted with the TCMT expression (equation (S20) in Supplementary
Information) (black dashed lines). At each angle, the positions of the complex
eigenvalues extracted experimentally are shown as open circles. b, Complex
eigenvalues extracted experimentally (symbols), with comparison to numerical
simulation results (dashed lines) for both the real part (left panel) and the


(non-resonant) and the resonant reflection processes including nearby 
resonances; the expression for reflectivity is given in Supplementary 
Information equation (S20), with the full derivation given in 
Supplementary Information section IX. Fitting the reflectivity curves 
with the TCMT expression gives us an accurate estimate of the matrix 
elements and the eigenvalues; this procedure is the same as our 
approach in ref. 27 except that here we additionally account for the 
coupling between resonances”*. Figure 3a compares the fitted and the 
measured reflectivity curves at three representative angles (with more 
comparison in Supplementary Fig. 8a); the excellent agreement shows 
the validity of the TCMT model. Underneath the reflectivity curves, we 
show the complex eigenvalues. The difference between numerically 
calculated reflectivity (Fig. 2d) and experimental results (Fig. 3a) stems 
from the non-radiative decay channels in our system, mostly due to 
scattering loss from the surface roughness”. 

Repeating the fitting procedure for the reflectivity spectrum measured
at different angles, we obtain the dispersion curves for all complex
eigenvalues, which are plotted in Fig. 3b. Along both directions in
k space (Γ → X and Γ → M), the two bands of interest (shown in
blue and red) exhibit the EP behaviour predicted in equation (2): for
|k| < k_c the real parts are degenerate and dispersionless; for |k| > k_c the
imaginary parts are degenerate and dispersionless; for |k| in the vicinity
of k_c, branching features are observed in the real or imaginary part.
In Fig. 3c, we plot the eigenvalues on the complex plane for both the
Γ → X and Γ → M directions. We can see that in both directions, the
two eigenvalues approach each other and become very close at a certain
k point, which is a clear signature of the system being very close to
an EP.
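The branching behaviour described above can be reproduced with a minimal two-mode toy model. The sketch below is an illustration only: the matrix, the parameter names (`omega0`, `v`, `gamma`) and their values are assumptions chosen for clarity and are not taken from the fitted TCMT model. One mode is lossless, the other decays at rate `gamma`, and the two couple linearly in k, so the eigenvalues coalesce at a threshold wavevector `k_c = gamma / (2 v)`.

```python
import cmath

def eigenvalues(k, omega0=1.0, v=1.0, gamma=0.2):
    """Eigenvalues of the toy matrix [[omega0, v*k], [v*k, omega0 - 1j*gamma]].

    For a symmetric 2x2 matrix the eigenvalues are the mean of the diagonal
    plus or minus sqrt(((a - b) / 2)**2 + c**2); here that square root is
    sqrt((v*k)**2 - (gamma/2)**2), which changes from imaginary to real at k_c.
    """
    root = cmath.sqrt((v * k) ** 2 - (gamma / 2) ** 2)
    centre = omega0 - 0.5j * gamma
    return centre + root, centre - root

# Threshold wavevector of the toy model (assumed parameters, see above).
k_c = 0.2 / (2 * 1.0)

inside = eigenvalues(0.5 * k_c)   # |k| < k_c: real parts coincide
outside = eigenvalues(2.0 * k_c)  # |k| > k_c: imaginary parts coincide
at_ep = eigenvalues(k_c)          # |k| = k_c: both eigenvalues coalesce

for label, pair in (("inside", inside), ("outside", outside), ("at EP", at_ep)):
    print(label, pair)
```

Sweeping `k` through `k_c` in this model gives the same qualitative picture as Fig. 3b: a flat, degenerate real part inside the ring and a flat, degenerate imaginary part outside it.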

We have shown that non-Hermiticity arising from radiation can 
significantly alter fundamental properties of the system, including the 
band structures and the density of states; this effect becomes most 


imaginary part (right panel). Red squares and dashed lines are used for the band
with zero radiation loss at the Γ point, blue circles and dashed lines for the
band with finite radiation loss at the Γ point, and grey crosses and dashed lines
for the third band decoupled from the previous two owing to symmetry.
The orange shaded regions correspond to the inside of the ring. c, Positions of
the eigenvalues (red and blue dashed lines) approach and become very close to
each other (indicated by the two brown arrows), demonstrating near-EP
features in different directions in the momentum space and the existence of
an exceptional ring.


prominent near EPs. The photonic crystal slab described here provides 
a simple-to-realize platform for studying the influence of EPs on light- 
matter interaction, such as for single particle detection?’ and modu- 
lation of quantum noise. The two-dimensional flat band can also 
provide a high density of states and therefore high Purcell factors. 
The strong dispersion of loss in the vicinity of the Γ point can improve
the performance of large-area single-mode photonic crystal lasers”. 
The deformation into an exceptional ring is a general phenomenon 
that can also be achieved with material gain or loss and for Dirac points 
in other lattices’””°. Further studies could advance the understanding 
of the connection between the topological property of Dirac points*” 
and that of EPs* in general non-Hermitian wave systems, and our 
method could go beyond photons to phonons, electrons and atoms. 


Received 2 April; accepted 29 June 2015. 
Published online 9 September 2015. 


1. Castro Neto, A. H., Guinea, F., Peres, N. M. R., Novoselov, K. S. & Geim, A. K. The
electronic properties of graphene. Rev. Mod. Phys. 81, 109-162 (2009).

2. Rechtsman, M. C. et al. Strain-induced pseudomagnetic field and photonic Landau
levels in dielectric structures. Nature Photon. 7, 153-158 (2013).

3. Tarruell, L., Greif, D., Uehlinger, T., Jotzu, G. & Esslinger, T. Creating, moving and
merging Dirac points with a Fermi gas in a tunable honeycomb lattice. Nature 483,
302-305 (2012).

4. Huang, X., Lai, Y., Hang, Z. H., Zheng, H. & Chan, C. Dirac cones induced by
accidental degeneracy in photonic crystals and zero-refractive-index materials.
Nature Mater. 10, 582-586 (2011).

5. Moitra, P. et al. Realization of an all-dielectric zero-index optical metamaterial.
Nature Photon. 7, 791-795 (2013).

6. Moiseyev, N. Non-Hermitian Quantum Mechanics (Cambridge Univ. Press, 2011).

7. Rotter, I. A non-Hermitian Hamilton operator and the physics of open quantum
systems. J. Phys. A 42, 153001 (2009).

8. Bender, C. M. & Boettcher, S. Real spectra in non-Hermitian Hamiltonians having
PT symmetry. Phys. Rev. Lett. 80, 5243-5246 (1998).

9. Rüter, C. E. et al. Observation of parity-time symmetry in optics. Nature Phys. 6,
192-195 (2010).




10. Chong, Y., Ge, L. & Stone, A. D. PT-symmetry breaking and laser-absorber modes in
optical scattering systems. Phys. Rev. Lett. 106, 093902 (2011).

11. Regensburger, A. et al. Parity-time synthetic photonic lattices. Nature 488,
167-171 (2012).

12. Guo, A. et al. Observation of PT-symmetry breaking in complex optical potentials.
Phys. Rev. Lett. 103, 093902 (2009).

13. Lin, Z. et al. Unidirectional invisibility induced by PT-symmetric periodic
structures. Phys. Rev. Lett. 106, 213901 (2011).

14. Peng, B. et al. Parity-time-symmetric whispering-gallery microcavities. Nature
Phys. 10, 394-398 (2014).

15. Liertzer, M. et al. Pump-induced exceptional points in lasers. Phys. Rev. Lett. 108,
173901 (2012).

16. Hodaei, H., Miri, M.-A., Heinrich, M., Christodoulides, D. N. & Khajavikhan, M.
Parity-time-symmetric microring lasers. Science 346, 975-978 (2014).

17. Feng, L., Wong, Z. J., Ma, R.-M., Wang, Y. & Zhang, X. Single-mode laser by
parity-time symmetry breaking. Science 346, 972-975 (2014).

18. Berry, M. Physics of nonhermitian degeneracies. Czech. J. Phys. 54, 1039-1047
(2004).

19. Makris, K., El-Ganainy, R., Christodoulides, D. & Musslimani, Z. H. Beam dynamics
in PT symmetric optical lattices. Phys. Rev. Lett. 100, 103904 (2008).

20. Szameit, A., Rechtsman, M. C., Bahat-Treidel, O. & Segev, M. PT-symmetry in
honeycomb photonic lattices. Phys. Rev. A 84, 021806 (2011).

21. Cao, H. & Wiersig, J. Dielectric microcavities: model systems for wave chaos and
non-Hermitian physics. Rev. Mod. Phys. 87, 61-111 (2015).

22. Sakoda, K. Proof of the universality of mode symmetries in creating photonic Dirac
cones. Opt. Express 20, 25181-25194 (2012).

23. Chan, C., Hang, Z. H. & Huang, X. Dirac dispersion in two-dimensional photonic
crystals. Adv. Optoelectron. 2012, 313984 (2012).

24. Lee, J. et al. Observation and differentiation of unique high-Q optical resonances
near zero wave vector in macroscopic photonic crystal slabs. Phys. Rev. Lett. 109,
067401 (2012).

25. Dembowski, C. et al. Experimental observation of the topological structure of
exceptional points. Phys. Rev. Lett. 86, 787-790 (2001).




26. Hsu, C. W., DeLacy, B. G., Johnson, S. G., Joannopoulos, J. D. & Soljačić, M.
Theoretical criteria for scattering dark states in nanostructured particles. Nano
Lett. 14, 2783-2788 (2014).

27. Hsu, C. W. et al. Observation of trapped light within the radiation continuum. Nature
499, 188-191 (2013).

28. Suh, W., Wang, Z. & Fan, S. Temporal coupled-mode theory and the presence of
non-orthogonal modes in lossless multimode cavities. IEEE J. Quantum Electron.
40, 1511-1518 (2004).

29. Chua, S.-L., Lu, L., Bravo-Abad, J., Joannopoulos, J. D. & Soljačić, M. Larger-area
single-mode photonic crystal surface-emitting lasers enabled by an accidental
Dirac point. Opt. Lett. 39, 2072-2075 (2014).

30. Lu, L., Joannopoulos, J. D. & Soljačić, M. Topological photonics. Nature Photon. 8,
821-829 (2014).


Supplementary Information is available in the online version of the paper.


Acknowledgements We thank T. Savas for fabrication of the samples, and F. Wang,
Y. Yang, N. Rivera, S. Skirlo, O. Miller and S. G. Johnson for discussions. This work was
partly supported by the Army Research Office through the Institute for Soldier
Nanotechnologies under contract nos W911NF-07-D0004 and W911NF-13-D-0001.
B.Z., L.L. and M.S. were partly supported by S3TEC, an Energy Frontier Research Center
funded by the US Department of Energy under grant no. DE-SC0001299. L.L. was
supported in part by the Materials Research Science and Engineering Center of the
National Science Foundation (award no. DMR-1419807). I.K. was supported in part by
Marie Curie grant no. 328853-MC-BSICS.


Author Contributions All authors discussed the results and made critical contributions
to the work.

Author Information Reprints and permissions information is available at 
www.nature.com/reprints. The authors declare no competing financial interests. 
Readers are welcome to comment on the online version of the paper. Correspondence 
and requests for materials should be addressed to B.Z. (bozhen@mit.edu) and C.W.H. 
(chiawei.hsu@yale.edu). 




doi:10.1038/nature14987 


Inhomogeneity of charge-density-wave order and
quenched disorder in a high-T_c superconductor


G. Campi*, A. Bianconi*, N. Poccia, G. Bianconi, L. Barba, G. Arrighetti, D. Innocenti, J. Karpinski, N. D. Zhigadlo,
S. M. Kazakov, M. Burghammer, M. v. Zimmermann, M. Sprung & A. Ricci


It has recently been established that the high-transition-temperature
(high-T_c) superconducting state coexists with short-range
charge-density-wave order (refs 1-11) and quenched disorder (refs 12, 13)
arising from dopants and strain (refs 14-17). This complex, multiscale phase
separation (refs 18-21) invites the development of theories of
high-temperature superconductivity that include complexity (refs 22-25). The
nature of the spatial interplay between charge and dopant order that
provides a basis for nanoscale phase separation remains a key open
question, because experiments have yet to probe the unknown
spatial distribution at both the nanoscale and mesoscale (between
atomic and macroscopic scale). Here we report micro X-ray
diffraction imaging of the spatial distribution of both short-range
charge-density-wave ‘puddles’ (domains with only a few wavelengths)
and quenched disorder in HgBa2CuO4+y, the single-layer
cuprate with the highest T_c, 95 kelvin (refs 26-28). We found that
the charge-density-wave puddles, like the steam bubbles in boiling
water, have a fat-tailed size distribution that is typical of
self-organization near a critical point. However, the quenched
disorder, which arises from oxygen interstitials, has a distribution
that is contrary to the usually assumed random, uncorrelated
distribution. The interstitial-oxygen-rich domains are spatially
anticorrelated with the charge-density-wave domains, because
higher doping does not favour the stripy charge-density-wave
puddles, leading to a complex emergent geometry of the spatial
landscape for superconductivity.

Although it is known that the incommensurate charge-density-wave
(CDW) order in cuprates (copper oxides) is made of ordered,
stripy, nanoscale ‘puddles’ with an average of only 3-4 oscillations,
information about the size distribution and spatial organization of
these puddles has so far not been available. We present experiments
that demonstrate that CDW puddles have a complex spatial distribution
and coexist with, but are spatially anticorrelated to, quenched disorder
in HgBa2CuO4+y (Hg1201). The sample we studied is a layered
perovskite at optimum doping with oxygen interstitials (y = 0.12),
tetragonal symmetry (P4/mmm) and a low misfit strain. The
X-ray diffraction (XRD) measurements (see Methods) show diffuse
CDW satellites (secondary peaks surrounding a main peak) at
q_CDW = (0.23a*, 0.16c*) in the b* = 0 plane and q_CDW = (0.23b*,
0.16c*) in the a* = 0 plane (where a*, b* and c* are the reciprocal
lattice units) around specific Bragg peaks, such as (108), below the
onset temperature T_CDW = 240 K (see Fig. 1a). The component of the
momentum transfer q_CDW in the CuO2 plane (0.23a*) in this case is
smaller than it is in the underdoped case (0.28a*) (ref. 5). The temperature
evolution of the CDW-peak profile along a* (in the h direction;
Fig. 1b) shows a smeared, glassy-like evolution for temperatures
below T_CDW. The CDW-peak intensity reaches a maximum at
T = 100 K, followed by a drop associated with the onset of
superconductivity at T = T_c. We investigated the isotropic character of the
CDW in the a-b plane using azimuthal scans, as shown in Fig. 1c.
We observed an equal probability of vertically and horizontally
striped CDW puddles.

Our main result is the discovery of the statistical spatial distribution 
of the CDW-puddle size and density throughout the sample, which 
shows an emergent complex network geometry for the superconducting
phase. We performed scanning micro X-ray diffraction (SμXRD)
measurements (see Methods) to extend the imaging of spatial inhomo- 
geneity previously obtained by scanning tunnelling microscopy 
(STM)’° from the surface to the bulk of the sample and from nanos- 
cale to mesoscale spatial inhomogeneity. Clear evidence of the 
inhomogeneous spatial distribution of the CDW is provided by the 
observation of very different CDW-peak profiles collected at different 
illuminated sample spots (see Fig. 1d) corresponding to spots with 
‘large’ and ‘small’ puddles. 

We investigated the temperature dependence of CDW domains by
recording the CDW-peak intensity and its full-width at half-maximum
(FWHM) during cooling from 280 K to 85 K. We collected the data in
two different places on the sample corresponding to ‘large’ and ‘small’
CDW puddles. Figure 1e, f shows the temperature dependence of
population (intensity), the number of oscillations h_CDW/Δh_CDW
(where h_CDW and Δh_CDW are the position and the FWHM of the
CDW peak profile in units of a*, respectively) and in-plane puddle
size ξ_a (along the a axis) in large (red filled circles) and small (black
filled squares) CDW puddles. The broad phase transition appears to be
arrested, as indicated by the size of the CDW puddles ξ_a = 1/Δh_CDW,
which does not diverge below T_CDW. This behaviour is typical of
low-dimensional systems with quenched disorder. A map representing the
spatial organization of the CDW-puddle size is shown in Fig. 1g. The
probability density function (PDF) of the in-plane CDW-puddle size
ξ_a is shown in Fig. 1h.
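The conversion from a measured peak width to a puddle size and an oscillation count can be sketched as follows. This is a hedged illustration, not the authors' analysis code: it assumes the FWHM is expressed in reciprocal lattice units with a* = 1/a, so that the size in nanometres is a divided by the FWHM, and the example width of 0.09 is a hypothetical value chosen only to show the arithmetic. The lattice parameter and the peak position 0.23 are the values quoted in the text.

```python
A_NM = 0.38748   # in-plane lattice parameter a quoted in Methods (nm)
H_CDW = 0.23     # CDW peak position along h, in units of a*

def puddle_size_nm(delta_h, a_nm=A_NM):
    """In-plane domain size, assuming size = a / FWHM with FWHM in units of a*."""
    if delta_h <= 0:
        raise ValueError("FWHM must be positive")
    return a_nm / delta_h

def n_oscillations(delta_h, h_cdw=H_CDW):
    """Number of CDW periods inside one puddle: peak position over FWHM."""
    if delta_h <= 0:
        raise ValueError("FWHM must be positive")
    return h_cdw / delta_h

example_fwhm = 0.09  # hypothetical FWHM in units of a*, for illustration only
size = puddle_size_nm(example_fwhm)
waves = n_oscillations(example_fwhm)
cdw_period_nm = A_NM / H_CDW  # real-space CDW wavelength implied by h = 0.23

print(f"size = {size:.2f} nm, oscillations = {waves:.2f}, period = {cdw_period_nm:.2f} nm")
```

Under this convention the ratio of size to oscillation count is always the CDW period a/0.23, which makes the two quantities plotted in Fig. 1f equivalent up to a constant factor.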

The PDF has a long fat tail that extends over an order of
magnitude, and is fitted by $\mathrm{PDF}(\xi_a) \sim \xi_a^{-\alpha_{\mathrm{CDW}}} \exp(-\xi_a/\xi_0)$, where
$\alpha_{\mathrm{CDW}} = 2.8 \pm 0.1$ is the critical exponent of the puddle-size
power-law distribution and $\xi_0 > 40$ nm. Although we can determine that the
average size of CDW puddles is 4.3 nm (in agreement with previous
work), PDF(ξ_a) has a non-Gaussian shape and rare, larger puddles
reaching sizes of 40 nm are detected. Our finding of a fat-tailed
distribution for the CDW-puddle size is in agreement with previous results
obtained by STM. Such structures, where spontaneous breaking of
both translational symmetry (CDW electronic crystalline phase) and
gauge symmetry (superconductivity) coexist, have been called
superstripes (ref. 19). The distribution of the CDW puddles we have found
introduces a substantial topological change to the available space for

1 Institute of Crystallography, CNR, via Salaria Km 29.300, Monterotondo Roma, I-00015, Italy.
2 Rome International Center for Materials Science, Superstripes, RICMASS, via dei Sabelli 119A, I-00185 Roma, Italy.
3 MESA+ Institute for Nanotechnology, University of Twente, PO Box 217, 7500 AE Enschede, The Netherlands.
4 School of Mathematics, Queen Mary University of London, London E1 4SN, UK.
5 Institute of Crystallography, Sincrotrone Elettra UOS Trieste, Strada Statale 14 - Km 163,5 Area Science Park, 34149 Basovizza, Trieste, Italy.
6 EPFL, Institute of Condensed Matter Physics, Lausanne CH-1015, Switzerland.
7 ETH, Swiss Federal Institute of Technology Zurich Laboratory for Solid State Physics, CH-8093 Zurich, Switzerland.
8 Department of Chemistry, M.V. Lomonosov Moscow State University, Moscow 119991, Russia.
9 European Synchrotron Radiation Facility, BP 220, F-38043 Grenoble Cedex, France.
10 Department of Analytical Chemistry, Ghent University, Krijgslaan 281, S12 B-9000 Ghent, Belgium.
11 Deutsches Elektronen-Synchrotron DESY, Notkestraße 85, D-22607 Hamburg, Germany.


*These authors contributed equally to this work. 


17 SEPTEMBER 2015 | VOL 525 | NATURE | 359 




[Figure 1 image. Recoverable axes: CDW-peak profiles as normalized intensity (arbitrary units); intensity, h_CDW/Δh_CDW and domain size ξ_a (nm) versus temperature T (K), spanning about 100 K to 280 K; probability density function versus domain size ξ_a (nm) on logarithmic axes.]


Figure 1 | Temperature dependence and spatial distribution of CDW
puddles in Hg1201. a, The CDW satellite near the (108) Bragg peak appears
below 240 K. b, Temperature dependence of CDW-peak profiles along h. The
CDW-peak intensity I_CDW is measured as the number of counts minus
the background. c, The q_CDW = (0.23a*(b*), 0.16c*) peak profile at different
azimuthal angles α showing the peak isotropy. d, Two typical CDW peaks
collected at two different places in the same crystal. Red solid circles correspond
to the diffraction profile from an illuminated part of the sample with large
CDW puddles (red in g); black filled squares correspond to an illuminated part
of the sample with small CDW puddles (blue in g); and the solid lines are


superconductivity: the current running from a point A to a point B
of the material can take different paths (see Fig. 1i) that are not
topologically equivalent (ref. 29), thus forming an emergent complex
hyperbolic geometry (ref. 30).

To investigate the interplay between the CDW puddles and the
quenched disorder, we studied the spatial distribution of oxygen
defects. The quenched lattice disorder is due to oxygen interstitials
(O_i), which form O_i atomic stripes in the HgO_y layers, in agreement
with previous experiments. HgBa2CuO4+y (ref. 28), like
YBa2Cu3O6+y (ref. 15) and La2CuO4+y (refs 14, 16), shows T_c
variations, owing to the effect of the spatial organization of O_i on
superconductivity. The average O_i self-organization was detected by
high-energy XRD (see Methods). Figure 2a shows the (0<h<5, 
0<k<A) portion of reciprocal space, where there is strong evidence 
of diffuse streaks running along the a* and b* directions and crossing 
all the Bragg peaks. Our high-energy XRD data confirm the formation
of O_i stripes intercalated between the CuO2 planes, both in the (100)
and (010) directions.

The spatial distribution of the intensity of the streaks was obtained
by SμXRD (see Methods). We measured the reciprocal a*-c* plane
(or b*-c* plane) around the (006) Bragg peak in reflection geometry.
The O_i stripes in Hg1201 run along the a* (b*) direction with no
correlation along the c* direction; therefore, they also lead to streaks



Gaussian fits. e, The CDW-peak intensity as a function of temperature, at the
two different places on the sample corresponding to large (red filled circles,
right axis) and small (black filled squares, left axis) CDW puddles. The dashed
line corresponds to T = T_c and the dotted line to T = T_CDW. f, Evolution of the
number of CDW oscillations (h_CDW/Δh_CDW) inside a CDW puddle and the
CDW domain size along the a axis (ξ_a). g, h, Spatial map (g) and probability
density function (h) of the CDW-puddle size. Scale bar in g, 10 μm. i, A
schematic of non-equivalent paths, running in the interface space between
CDW puddles, connecting point A to point B in the emergent complex
non-Euclidean spatial geometry (refs 29, 30) for the superconducting current.




on the a*-c* plane. A schematic of O_i atomic stripes is shown in
Fig. 2b. In Fig. 2c we show the spatial map of the streak intensity.
The picture shows rich (bright yellow) and poor (dark black) regions
of O_i stripes. The PDF of O_i-rich regions in Fig. 2d is fitted by
$\mathrm{PDF}(I) \sim (I/I_0)^{-\alpha_{O_i}} \exp(-I/I_c)$, where $I_0$ is the average intensity,
$\alpha_{O_i} = 2.0 \pm 0.1$ is the critical exponent and $I_c > 20$.
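Both fitted distributions in this letter share one functional form, a power law with an exponential cutoff. The sketch below shows how such a density can be normalized numerically and how its mean can be evaluated. It is a minimal illustration under stated assumptions: the exponent 2.8 and the cutoff scale of 40 nm are the CDW values quoted in the text, whereas the lower and upper integration bounds (`X_MIN`, `X_MAX`) are hypothetical choices that the text does not specify, and the mean depends strongly on the lower bound.

```python
import math

def cutoff_power_law(x, alpha, x_cut):
    """Unnormalized density x**(-alpha) * exp(-x / x_cut)."""
    return x ** (-alpha) * math.exp(-x / x_cut)

def log_grid(x_min, x_max, n):
    """Logarithmically spaced grid from x_min to x_max with n points."""
    ratio = (x_max / x_min) ** (1.0 / (n - 1))
    return [x_min * ratio ** i for i in range(n)]

def trapezoid(xs, ys):
    """Trapezoid-rule integral of samples ys on the grid xs."""
    return sum(0.5 * (ys[i] + ys[i + 1]) * (xs[i + 1] - xs[i])
               for i in range(len(xs) - 1))

ALPHA = 2.8     # exponent quoted for the CDW puddle sizes
X_CUT = 40.0    # cutoff scale in nm, taken as 40 for illustration
X_MIN = 2.0     # hypothetical smallest resolvable puddle size (nm)
X_MAX = 200.0   # hypothetical upper integration limit (nm)

grid = log_grid(X_MIN, X_MAX, 4000)
raw = [cutoff_power_law(x, ALPHA, X_CUT) for x in grid]
area = trapezoid(grid, raw)
pdf = [r / area for r in raw]                       # integrates to 1 on the grid
mean_size = trapezoid(grid, [x * p for x, p in zip(grid, pdf)])

print(f"mean size on [{X_MIN}, {X_MAX}] nm: {mean_size:.2f} nm")
```

A log-spaced grid is used because the density varies over several decades; a uniform grid would need far more points to resolve the steep small-size end.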

In Fig. 3 we present results on the spatial interplay between CDW-rich
regions and O_i-rich regions. We calculated the ‘difference map’
(see Methods) between CDW peaks and O_i diffuse streaks. The poor
CDW regions on the CuO2 basal plane correspond to O_i-rich regions
on the HgO_y layers, as illustrated in Fig. 3a. The CDW puddles and
O_i-rich regions give rise to the positive and negative peaks,
respectively, in the surface plot shown in Fig. 3b. The spatial anticorrelation is
evident from the scatter plot of O_i intensity versus CDW intensity
(Fig. 3c). As O_i intensity increases, the CDW intensity decreases,
and vice versa. This is consistent with the fact that excess O_i means
higher doping, and high doping does not favour stripy, underdoped
short-range CDW order. Figure 3d shows the two maps obtained via
the segmentation of the difference map, and provides a direct image of
how doping-poor (CDW-rich regions are shown in red) and doping-rich
(O_i-rich regions are shown in blue) phases are arranged in
different regions of the material. Figure 3e illustrates the nanoscale
configuration of CDW-puddles (red spots) in the CuO2 plane using




Figure 2 | Correlated quenched disorder due to O_i atomic stripes in
Hg1201. a, A portion of the h-k diffraction pattern. Resolution-limited streaks
connect the Bragg peaks, owing to the formation of O_i stripes in the HgO_y
spacer layers. b, Schematic representation of the atomic O_i stripes. c, SμXRD


the experimental distribution of CDW size; this distribution generates 
‘holes’ in the space available for the free electrons (light blue area). This 
space is topologically interesting: there are an infinite number of ways 
for a current path to connect a point A to a point B around the CDW 
puddles, which are not only distinguished by the number of times a 
path goes around a single hole, but also by the way the path passes 


[Figure image residue. Recoverable labels: ‘CDW puddles’, ‘O_i atomic stripes’, ‘CDW-rich’, ‘O-rich’; panel d is a probability-density plot for the O_i atomic stripes on logarithmic axes.]


map of a region of a showing the relative O_i streak intensity I_Oi. The bright
(dark) spots correspond to sample regions with a high (low) density of O_i
atomic stripes, called O_i-rich (poor) regions. Scale bar, 10 μm. d, Probability
density function calculated from the O_i-streaks intensity map.


through the pattern of CDW puddles. The complex space that
emerges from the mesoscopic phase separation, both in the spacer
layers and in the CuO2 plane, substantially changes (1) the dielectric
constant that controls the long-range Coulomb interaction that is 
relevant for phase separation near a Lifshitz transition”, (2) the dielec- 
tric constant that is relevant to electron-electron interaction in the 


[Figure 3c image: scatter plot of O_i intensity (arbitrary units) versus CDW intensity (arbitrary units, 0 to 0.4).]

Figure 3 | Spatial anticorrelation between CDW-rich and O_i-rich regions.
a, The CDW-rich regions (red) on the CuO2 planes and O_i-rich regions (blue)
on the HgO_y layers. b, Surface plot of the difference map (see Methods) between
the CDW-peak and O_i-streak intensity. The positive (green to red) values
indicate the CDW-rich regions and the negative (green to blue) values
correspond to O_i-rich regions. Scale bar, 5 μm. c, Scatter plot of O_i versus CDW
intensity demonstrating the negative correlation between CDW-puddle and
O_i-stripe populations. d, Segmentations of the difference map in b highlighting
the network of CDW-rich domains (left panel) and O_i-rich regions (right
panel). Scale bar, 10 μm. e, A schematic of the nanoscale texture formed by
CDW-rich regions (red spots) and the ‘charge-O_i-rich’ region (light blue area),
which define an interface space and loci of the superconductivity with a
complex non-Euclidean geometry.




pairing and (3) the geometrical and topological properties of the space 
that is available for the overall phase coherence of the macroscopic 
quantum condensate that is made up of multiple condensates at the 
nanoscale with a single critical temperature’. 
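The anticorrelation between the CDW and O_i intensity maps discussed above can be quantified with a Pearson coefficient, and a difference map can be built from the two normalized maps. The sketch below is only an illustration: the exact definition of the difference map is given in Methods, which is not reproduced here, so the zero-mean, unit-variance normalization is an assumption, and the per-pixel intensities are hypothetical numbers, not measured data.

```python
import math

def normalize(values):
    """Rescale intensities to zero mean and unit standard deviation."""
    n = len(values)
    mean = sum(values) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    if std == 0:
        raise ValueError("constant map cannot be normalized")
    return [(v - mean) / std for v in values]

def pearson(x, y):
    """Pearson correlation coefficient of two equally long intensity lists."""
    zx, zy = normalize(x), normalize(y)
    return sum(a * b for a, b in zip(zx, zy)) / len(x)

def difference_map(cdw, oxy):
    """Assumed definition: normalized CDW map minus normalized O_i map."""
    return [c - o for c, o in zip(normalize(cdw), normalize(oxy))]

# Hypothetical per-pixel intensities (arbitrary units), NOT measured data.
cdw_intensity = [0.35, 0.30, 0.22, 0.15, 0.10, 0.05]
oi_intensity = [0.10, 0.15, 0.30, 0.45, 0.60, 0.70]

r = pearson(cdw_intensity, oi_intensity)
diff = difference_map(cdw_intensity, oi_intensity)
cdw_rich = [d > 0 for d in diff]   # segmentation: positive values mark CDW-rich pixels

print(f"Pearson r = {r:.3f}")
print("CDW-rich mask:", cdw_rich)
```

Thresholding the difference map at zero mirrors the segmentation into CDW-rich and O_i-rich regions shown in Fig. 3d, although the threshold actually used may differ.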

This work offers new insight into the complexity of nanoscale 
phase-separation phenomena in high-temperature superconductors. 
More generally, our results deal with the effects of quenched disorder 
in phase transitions. A phase transition that would be first order in the 
clean limit gets smeared into a continuous-looking transition in the 
presence of a random, Gaussian distributed, quenched disordered 
background (refs 12, 13). Here the disorder itself is not randomly distributed,
but has a long-tailed probability density function, leading to correlated
disorder. Even in the ‘ideal’ single-layer cuprate superconductor
HgBa2CuO4+y at optimum doping (T_c = 95 K), the CDW order
self-organizes into puddles, forming an inhomogeneous landscape 
with an emergent complex network geometry. Our results provide 
further evidence for the universality of mesoscale phase separation 
even in the most optimized superconducting cuprates, which implies 
that the superconductivity will be non-uniform throughout what is a 
granular medium. 


Online Content Methods, along with any additional Extended Data display items 
and Source Data, are available in the online version of the paper; references unique 
to these sections appear only in the online paper. 


Received 22 March; accepted 21 July 2015. 


1. Chang, J. et al. Direct observation of competition between superconductivity and
charge density wave order in YBa2Cu3O6.67. Nature Phys. 8, 871-876 (2012).

2. Croft, T. P., Lester, C., Senn, M. S., Bombardi, A. & Hayden, S. M. Charge density wave
fluctuations in La2−xSrxCuO4 and their competition with superconductivity. Phys.
Rev. B 89, 224513 (2014).

3. Comin, R. et al. Broken translational and rotational symmetry via charge stripe
order in underdoped YBa2Cu3O6+y. Science 347, 1335-1339 (2015).

4. Poccia, N. et al. Optimum inhomogeneity of local lattice distortions in La2CuO4+y.
Proc. Natl Acad. Sci. USA 109, 15685-15690 (2012).

5. Tabis, W. et al. Charge order and its connection with Fermi-liquid charge transport
in a pristine high-T_c cuprate. Nature Commun. 5, 5875 (2014).

6. Comin, R. et al. Charge order driven by Fermi-arc instability in Bi2Sr2−xLaxCuO6+δ.
Science 343, 390-392 (2014).

7. Kohsaka, Y. et al. An intrinsic bond-centered electronic glass with unidirectional
domains in underdoped cuprates. Science 315, 1380-1385 (2007).

8. Mesaros, A. et al. Topological defects coupling smectic modulations to intra-unit-cell
nematicity in cuprates. Science 333, 426-430 (2011).

9. Phillabaum, B., Carlson, E. W. & Dahmen, K. A. Spatial complexity due to bulk
electronic nematicity in a superconducting underdoped cuprate. Nature Commun.
3, 915 (2012).

10. Gabovich, A. M., Voitenko, A. I., Annett, J. F. & Ausloos, M. Charge- and
spin-density-wave superconductors. Supercond. Sci. Technol. 14, R1-R27 (2001).

11. Bianconi, A. et al. Determination of the local lattice distortions in the CuO2 plane of
La1.85Sr0.15CuO4. Phys. Rev. Lett. 76, 3412-3415 (1996).

12. Imry, Y. & Ma, S.-K. Random-field instability of the ordered state of continuous
symmetry. Phys. Rev. Lett. 35, 1399-1401 (1975).

13. Vojta, T. Rare region effects at classical, quantum and nonequilibrium phase
transitions. J. Phys. Math. Gen. 39, R143-R205 (2006).




14. Fratini, M. et al. Scale-free structural organization of oxygen interstitials in
La2CuO4+y. Nature 466, 841-844 (2010).

15. Ricci, A. et al. Multiscale distribution of oxygen puddles in 1/8 doped
YBa2Cu3O6.67. Sci. Rep. 3, 2383 (2013).

16. Poccia, N. et al. Evolution and control of oxygen order in a cuprate superconductor.
Nature Mater. 10, 733-736 (2011).

17. Drees, Y. et al. Hour-glass magnetic excitations induced by nanoscopic phase
separation in cobalt oxides. Nature Commun. 5, 5731 (2014).

18. Gor’kov, L. P. & Teitel’baum, G. B. Two-component energy spectrum of cuprates in
the pseudogap phase and its evolution with temperature and at charge ordering.
Sci. Rep. 5, 8524 (2015).

19. Bianconi, A. Shape resonances in superstripes. Nature Phys. 9, 536-537 (2013).

20. Alvarez, G., Moreo, A. & Dagotto, E. Complexity in high-temperature
superconductors. Low Temp. Phys. 32, 290-297 (2006).

21. Kresin, V., Ovchinnikov, Y. & Wolf, S. Inhomogeneous superconductivity and the
“pseudogap” state of novel superconductors. Phys. Rep. 431, 231-259 (2006);
erratum 437, 233-234 (2007).

22. She, J.-H. & Zaanen, J. BCS superconductivity in quantum critical metals. Phys.
Rev. B 80, 184518 (2009).

23. Bianconi, G. Superconductor-insulator transition on annealed complex networks.
Phys. Rev. B 85, 061113 (2012).

24. Davison, R. A., Schalm, K. & Zaanen, J. Holographic duality and the resistivity of
strange metals. Phys. Rev. B 89, 245116 (2014).

25. Lucas, A. & Sachdev, S. Conductivity of weakly disordered strange metals: from
conformal to hyperscaling-violating regimes. Nucl. Phys. B 892, 239-268 (2015).

26. Karpinski, J. et al. High-pressure synthesis, crystal growth, phase diagrams,
structural and magnetic properties of Y2Ba4CunO2n+x, HgBa2Can−1CunO2n+2+δ,
and quasi-one-dimensional cuprates. Supercond. Sci. Technol. 12, R153-R181
(1999).

27. Wagner, J. L. et al. in Phase Transitions and Self-Organization in Electronic and
Molecular Networks (eds Thorpe, M. F. & Phillips, J. C.) 331-339 (Springer, 2001).

28. Izquierdo, M. et al. One dimensional ordering of doping oxygen in HgBa2CuO4+δ
superconductors evidenced by X-ray diffuse scattering. J. Phys. Chem. Solids 72,
545-548 (2011).

29. Zeng, W., Sarkar, R., Luo, F., Gu, X. & Gao, J. Resilient routing for sensor networks
using hyperbolic embedding of universal covering space. In INFOCOM, 2010 Proc.
IEEE 1-9 (IEEE, 2010).

30. Wu, Z., Menichetti, G., Rahmede, C. & Bianconi, G. Emergent complex network
geometry. Sci. Rep. 5, 10073 (2015).


Acknowledgements We acknowledge the ESRF, ELETTRA and DESY synchrotron 
facilities for radiation-beam time and support. We thank the beamline scientists for 
help with experiments. We acknowledge the Calypso programme for travel support. We 
acknowledge support from the Superstripes Institute. N.P. acknowledges financial 
support from a Marie Curie Intra-European Fellowship for career development. 


Author Contributions All authors have contributed to essential portions of this work. 
The experiment was conceived by A.B., G.C., N.P., G.B. and A.R.; the Hg1201 crystals 
were grown at ETH by S.M.K. and J.K.; N.D.Z. and N.P. performed magnetic 
characterization of Hg1201 single crystals; experiments at DESY were performed by 
A.R., G.C., N.P., M.S., M.v.Z. and D.I.; experiments at ESRF were performed by A.R., N.P., 
G.C., A.B. and M.B.; experiments at Elettra were performed by G.C., L.B., G.A., A.B. and 
A.R.; and the data analysis was carried out by G.C., A.B., G.B. and A.R. All authors 
discussed the results and contributed to the writing of the manuscript. 


Author Information Reprints and permissions information is available at 
www.nature.com/reprints. The authors declare no competing financial interests. 
Readers are welcome to comment on the online version of the paper. Correspondence 
and requests for materials should be addressed to A.B. 
(antonio.bianconi@ricmass.eu). 


©2015 Macmillan Publishers Limited. All rights reserved 


METHODS 


Sample preparation and characterization. The HgBa2CuO4+y (Hg1201) crystal 
with y = 0.12, grown at ETH (ref. 26), has a sharp superconducting transition at 
Tc = 95 K. The crystal structure has P4/mmm symmetry with lattice parameters 
a = b = 0.387480(5) nm and c = 0.95078(2) nm at T = 100 K (numbers in parentheses 
indicate the standard deviation of the last digit). 

XRD measurements using the XRD1 beamline. To identify the CDW order in a 
single Hg1201 crystal we used XRD at the XRD1 beamline of the Elettra 
synchrotron radiation facility in Trieste, Italy, tuning the photon energy between 
13 keV and 16 keV with a beam size of 200 × 200 μm². Only selected reflections 
show clear CDW satellites, in agreement with ref. 2. We focused on the CDW 
satellite located at qCDW = (0.23, 0, 0.16) around the (108) Bragg reflection, which 
appeared as the sample was cooled below 240 K. Typical diffraction patterns 
collected at 85 K, 105 K and 280 K are shown in Fig. 1a. To get a direct view of 
the temperature dependence of the CDW-satellite reflection for T = 280-85 K, a 
two-dimensional colour plot of the CDW-peak profile along the a* direction as a 
function of temperature is shown in Fig. 1b. 

High-energy XRD measurements using the BW5 beamline. High-energy XRD 
measurements were collected using the BW5 beamline at DESY, Hamburg, 
Germany, using a transmission geometry and an X-ray energy of 100 keV. 
A single SiGe(111) gradient monochromator was used. The beam size was 
200 μm × 200 μm. We used a vertical rotation axis, and the c axis of the single 
crystal was oriented parallel to the direction of the incoming X-ray beam. 
The diffraction patterns were collected by an area detector in the temperature 
range 20-300 K. In this geometry, we can probe the lattice fluctuations in the 
a-b plane. The possible CDW-peak anisotropy was examined using azimuthal scans 
(0° < φ < 90°), as shown in the schematic in Fig. 1c. The CDW-peak amplitude 
and FWHM do not change substantially as a function of φ, as shown in the colour 
plot of the diffraction profile (as a function of φ) in Fig. 1c; instead, this plot shows 
the CDW planar isotropy, in agreement with the tetragonal P4/mmm symmetry of 
the lattice. The resolution-limited streaks connecting the Bragg peaks, which 
arise from the organization of single Oi stripes in the mercury spacer layer, are 
shown in Fig. 2a. This figure shows a portion of the h-k diffraction pattern that was 
collected at DESY. The spatial distribution of the Oi atomic stripes in the HgOy 
spacer layers, which cause one-dimensional doping and lattice spatial inhomogeneity, 
does not vary with temperature below 250 K, leading to quenched disorder 
at the onset of the charge-density phase. 

SμXRD measurements using the ID13 beamline. SμXRD experiments were 
performed in reflection geometry using the ID13 beamline at ESRF, Grenoble, 
France. We applied an incident X-ray energy of 13 keV. By moving the sample under 
a 1-μm focused beam with an x-y translator, we scanned a sample area of 
65 × 80 μm², collecting 5,200 different diffraction patterns at T = 100 K. For each 
scanned point of the sample, the qCDW-peak profile was extracted; the FWHMs 
along the a*(b*) and c* directions were evaluated to obtain the domain size of the 
charge-ordered regions along the a(b) and c crystallographic axes. 




Two quite different CDW-peak profiles in the same crystal measured along the 
a*(b*) direction in the a*-c*(b*-c*) plane are shown in Fig. 1d. Here we show two 
typical profiles collected at two different spatial locations in the same crystal 
corresponding to large (red circles) and small (black squares) puddles. The 
continuous lines are the Gaussian fits to the data. The different amplitudes and 
FWHMs ((0.033 ± 0.001)a* and (0.089 ± 0.001)a* in the upper and lower panels 
of Fig. 1d, respectively; errors indicate standard deviation) of the two peaks, which 
correspond to large and small CDW puddles, provide evidence of a strong 
inhomogeneity in the CDW spatial distribution. The peak profiles appear 
the same along the a* and b* directions, confirming the peak isotropy in the basal 
plane of the tetragonal lattice. The intensity of the CDW satellites as a function of 
temperature, measured at two different locations on the sample corresponding to 
large (red) and small (black) CDW puddles, is shown in Fig. 1e. The vertical lines 
represent the superconducting temperature Tc and the CDW onset temperature 
TCDW. The order-disorder transition is very broad, which indicates the role of the 
quenched disorder owing to the presence of defects. Moreover, the CDW intensity 
shows a clear drop around Tc that appears to depend on the CDW puddle size. The 
temperature dependence of the number of CDW oscillations inside a single puddle 
(hCDW/ΔhCDW) and the domain size of a single puddle along the a(b) axis (ξa) are 
shown in Fig. 1f. (ΔhCDW and hCDW are the FWHM and the location along a* of 
the CDW peak; the domain size along the a axis (b axis) is given by the correlation 
length ξa.) The inhomogeneity of the CDW distribution is depicted in the 
65 × 80-μm² XRD map of the (nanoscale) size of CDW domains in Fig. 1g. This figure 
shows loci of large (red-yellow area) and small (blue area) CDW puddles. The 
scale bar corresponds to 10 μm. Using the ID13 microfocus beamline at ESRF, we 
can also detect the spatial distribution of the quenched disorder. Figure 2c shows 
the SμXRD map of the integrated intensity of the streaks of Oi stripes. The bright 
(dark) spots correspond to sample regions with a high (low) density of Oi atomic 
stripes, called Oi-rich (poor) regions. The scale bar is 10 μm. Figure 2d shows the 
PDF of the Oi-streak intensity that was obtained from the SμXRD map. This plot 
shows the probability distribution of the Oi-rich regions. The experimental set-up 
allows us to investigate the spatial interplay between CDW puddles in the CuO2 
plane and Oi-rich domains in the HgOy layers, shown in Fig. 3. We measured the 
‘difference map’ (ICDW − IOi), where ICDW and IOi are the intensities of the 
qCDW peak and the Oi diffuse streaks, respectively, normalized to [0, 1]. The 
surface plot of this difference map is shown in Fig. 3b. The positive (green to 
red) peaks indicate CDW-puddle-rich regions and the negative (green to blue) 
peaks indicate Oi-rich regions. The spatial anticorrelation between CDW puddles 
and Oi atomic stripes is obtained by segmentation of the difference map. We use 
this segmentation to visualize the phase separation owing to the network of 
CDW-rich domains, which correspond to ‘charge poor’ domains in the CuO2 planes (left 
panel of Fig. 3d), and Oi-rich regions in the HgOy layers, which correspond to 
‘charge rich’ portions of the CuO2 plane (right panel of Fig. 3d). 
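
The difference-map construction described above can be illustrated with a minimal sketch. It is not the authors' analysis code (which is stated to be unavailable). The min-max normalization, the zero threshold used for segmentation, the function names and the toy 2 × 2 intensity maps are all assumptions made for illustration.

```python
from typing import List

Grid = List[List[float]]


def normalize(grid: Grid) -> Grid:
    """Min-max rescale a 2D intensity map to the interval [0, 1]."""
    flat = [v for row in grid for v in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:
        # A flat map carries no contrast; return zeros to avoid dividing by zero.
        return [[0.0 for _ in row] for row in grid]
    return [[(v - lo) / (hi - lo) for v in row] for row in grid]


def difference_map(i_cdw: Grid, i_oi: Grid) -> Grid:
    """Point-by-point difference of the two normalized maps (I_CDW - I_Oi)."""
    n_cdw, n_oi = normalize(i_cdw), normalize(i_oi)
    return [[c - o for c, o in zip(rc, ro)] for rc, ro in zip(n_cdw, n_oi)]


def segment(diff: Grid, threshold: float = 0.0) -> List[List[str]]:
    """Label each point by the sign of the difference (assumed criterion)."""
    return [["CDW" if v > threshold else "Oi" for v in row] for row in diff]


# Toy intensity maps (hypothetical values, arbitrary units).
i_cdw = [[9.0, 1.0], [5.0, 3.0]]
i_oi = [[10.0, 40.0], [20.0, 30.0]]

diff = difference_map(i_cdw, i_oi)
labels = segment(diff)
print(labels)
```

Positive entries of `diff` mark points where the normalized CDW intensity dominates, and negative entries mark points where the Oi-streak intensity dominates, mirroring the sign convention given for Fig. 3b.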

Code availability. The code we used for statistical analysis of the SμXRD data is 
not currently available (G.C., A.R. and A.B., manuscript in preparation). 




LETTER 


doi:10.1038/nature14881 


Designing switchable polarization and 
magnetization at room temperature in an oxide 


P. Mandal¹, M. J. Pitcher¹, J. Alaria², H. Niu¹, P. Borisov¹†, P. Stamenov³, J. B. Claridge¹ & M. J. Rosseinsky¹ 


Ferroelectric and ferromagnetic materials exhibit long-range order 
of atomic-scale electric or magnetic dipoles that can be switched by 
applying an appropriate electric or magnetic field, respectively. 
Both switching phenomena form the basis of non-volatile random 
access memory¹, but in the ferroelectric case, this involves destructive 
electrical reading and in the magnetic case, a high writing 
energy is required². In principle, low-power and high-density 
information storage that combines fast electrical writing and magnetic 
reading can be realized with magnetoelectric multiferroic 
materials³. These materials not only simultaneously display ferroelectricity 
and ferromagnetism, but also enable magnetic moments 
to be induced by an external electric field, or electric polarization 
by a magnetic field⁴,⁵. However, synthesizing bulk materials with 
both long-range orders at room temperature in a single crystalline 
structure is challenging because conventional ferroelectricity 
requires closed-shell d⁰ or s² cations, whereas ferromagnetic order 
requires open-shell dⁿ configurations with unpaired electrons⁶. 
These opposing requirements pose considerable difficulties for 
atomic-scale design strategies such as magnetic ion substitution 
into ferroelectrics⁷,⁸. One material that exhibits both ferroelectric 
and magnetic order is BiFeO3, but its cycloidal magnetic structure⁹ 
precludes bulk magnetization and linear magnetoelectric coupling¹⁰. 
A solid solution of a ferroelectric and a spin-glass perovskite 
combines switchable polarization¹¹ with glassy magnetization, 
although it lacks long-range magnetic order¹². Crystal engineering 
of a layered perovskite has recently resulted in room-temperature 
polar ferromagnets¹³, but the electrical polarization has not been 
switchable. Here we combine ferroelectricity and ferromagnetism 
at room temperature in a bulk perovskite oxide, by constructing a 
percolating network of magnetic ions with strong superexchange 
interactions within a structural scaffold exhibiting polar lattice 
symmetries at a morphotropic phase boundary¹⁴ (the compositional 
boundary between two polar phases with different polarization 
directions, exemplified by the PbZrO3-PbTiO3 system) that 
both enhances polarization switching and permits canting of the 
ordered magnetic moments. We expect this strategy to allow the 
generation of a range of tunable multiferroic materials. 

Several approaches to room-temperature multiferroicity have been 
explored. Composite multiferroics, which are multiphase mixtures 
of magnetic and ferroelectric materials, have displayed the largest 
magnetoelectric effects, originating from stress-mediated coupling¹⁵. 
The indirect nature of the cross-coupling between the polar and 
magnetic phases hinders complete switching of the ferroic properties 
through magnetoelectric coupling. The single-phase oxide BiFeO3 is 
antiferromagnetically ordered with competing exchange interactions 
producing a cycloidal structure with a period of 62 nm (ref. 9). 
Two approaches have been used to disrupt this cycloid. First, solid 
solutions of the non-polar, weakly ferromagnetic LnFeO3 (Ln = Sm, 
Dy, La) ferrites in BiFeO3 have a finite magnetization at room 
temperature¹⁶,¹⁷ in a fully ordered magnetic network. The inherent 
trade-off between the soft magnetic properties of the orthoferrite and 
the ferroelectric properties of BiFeO3 leads to intermediate 
compositions for which the long-range crystallographic symmetry (polar 
versus non-polar)¹⁸, the magnetic ground state and switchability¹⁹ 
are subject to debate. Second, strained and nanostructured BiFeO3 
films have shown remanent magnetization²⁰, and electrical control 
of the staggered magnetization in BiFeO3 can switch the 
magnetization of a coupled ferromagnetic material in a thin-film-device 
structure²¹. 

In BiFeO3, the ferroelectric polarization is aligned along the [111]p 
direction of the primitive cubic (indicated by the subscript p) ABO3 
perovskite subcell. The morphotropic phase boundary (MPB) between 
two non-cubic, polar crystallographic symmetries of the ABO3 
perovskite with distinct polarization directions is a route to large, 
switchable polarization via polarization rotation or reorientation¹⁴. The 
structure at the MPB is a single perovskite network with a complex 
domain microstructure²², where the Bragg scattering can be modelled 
in single- or multiple-phase approximations²³. We have recently 
produced a new MPB in a solid solution between rhombohedral (R, space 
group R3c, Fig. 1a) [111]p and orthorhombic (O, space group Pna2₁, 
Fig. 1b) [001]p polarization directions in the Bi³⁺-based perovskites 
(1 − x)BiTi(1−y)/2FeyMg(1−y)/2O3-(x)CaTiO3: the MPB occurs for 
0.075 ≤ x < 0.175, y = 0.25 (Fig. 1c)²⁴. This new MPB affords 
large switchable polarizations (P) in bulk materials, for example, 
P = 49 μC cm⁻² for x = 0.15, y = 0.25. These materials were designed 
to have high d⁰ Ti⁴⁺ and Mg²⁺ cation content on the octahedral B site 
to minimize dielectric loss and aid ferroelectric switching by sustaining 
the required electric field. Because the MPB structure is based on a 
continuous ABO3 network, there is a coherent magnetic B-site sublattice 
that is connected by B-O-B superexchange pathways throughout each 
crystallite. The low Fe content of 21.25% at x = 0.15, y = 0.25 is below the 
percolation threshold for the primitive cubic lattice (Fig. 1e): because 
magnetic order in insulators arises from nearest-neighbour 
superexchange, such a material cannot display long-range magnetic order, and 
none is observed (demonstrated by the monotonic temperature 
dependence of the field-cooled (FC) and zero-field-cooled (ZFC) magnetizations 
and the linear magnetization (M(H), where H is the magnetic field) 
isotherm at 100 K, Extended Data Fig. 1). 
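
The percolation argument can be checked with a short calculation. The sketch below assumes that Fe is distributed randomly over the B sites, so that the B-site Fe fraction is (1 − x) × y, and it compares that fraction with the site-percolation threshold of the simple cubic lattice. The threshold value of about 0.3116 is an approximate literature value supplied here for illustration; it is not quoted in the text, and the function names are hypothetical.

```python
# Approximate site-percolation threshold of the simple (primitive) cubic
# lattice with nearest-neighbour connectivity. Literature value, not taken
# from the paper.
SITE_PERCOLATION_SIMPLE_CUBIC = 0.3116


def fe_fraction(x: float, y: float) -> float:
    """B-site Fe fraction in (1 - x)BiTi(1-y)/2 Fe_y Mg(1-y)/2 O3 - (x)CaTiO3."""
    return (1.0 - x) * y


def percolates(x: float, y: float,
               threshold: float = SITE_PERCOLATION_SIMPLE_CUBIC) -> bool:
    """True if the Fe fraction exceeds the assumed percolation threshold."""
    return fe_fraction(x, y) > threshold


for y in (0.25, 0.60, 0.80):
    frac = fe_fraction(0.15, y)
    print(f"x = 0.15, y = {y:.2f}: Fe fraction = {frac:.4f}, "
          f"percolates = {percolates(0.15, y)}")
```

For x = 0.15, y = 0.25 this gives an Fe fraction of 0.2125, matching the 21.25% quoted above and lying below the assumed threshold. Exceeding the threshold is a necessary condition for long-range order in this picture, not a prediction of the ordering temperature.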

We therefore explored increasing the Fe content ((1 − x) × y) on 
the B site to generate long-range magnetic order within an MPB 
system that displays switchable polarization (Fig. 1d, f). A series of 
compositions in the range x = 0.15, 0.60 ≤ y ≤ 0.90 were prepared and the 
perovskite phase purity was confirmed by powder X-ray diffraction 
(PXRD; Extended Data Fig. 2). The compositions x = 0.15, y = 0.60 
and x = 0.15, y = 0.80 were selected for detailed property studies. 
Pawley refinements on these compositions show that a model with 
both R and O phases, and a single-phase monoclinic model in a 
√2ap × 2ap × √2ap unit cell (space group P11a, which is a polar 
subgroup of both R3c and Pna2₁; refined lattice parameters shown in 
Extended Data Table 1), produce superior fits to those obtained by 
a purely rhombohedral model (Extended Data Fig. 3 and Extended 
Data Table 1). This result demonstrates that these compositions exist 


¹Department of Chemistry, University of Liverpool, Liverpool L69 7ZD, UK. ²Department of Physics, University of Liverpool, Liverpool L69 7ZE, UK. ³CRANN, Trinity College Dublin, College Green, Dublin 2, 
Ireland. †Present address: Department of Physics, West Virginia University, Morgantown, West Virginia 26506, USA. 


17 SEPTEMBER 2015 | VOL 525 | NATURE | 363 




[Figure 1 image. Panels a-f: structure and percolation schematics, with recoverable labels ‘Increase x (>0.05)’, ‘Decrease x (<0.175)’, ‘Increase y’, ‘Ferroelectric [R3c + Pna2₁]’, ‘Polar Pna2₁’, ‘[111]p + [001]p’ and ‘Long-range magnetic order’. Panels g-i: PXRD intensity versus 2θ (°) from 36.0 to 38.5 for the R3c, R3c + Pna2₁ and P11a models.] 


Figure 1 | Crystal structure, magnetic percolation and the morphotropic 
phase boundary (MPB) in (1 − x)BiTi(1−y)/2FeyMg(1−y)/2O3-(x)CaTiO3 
where 0 ≤ x ≤ 0.35 and 0.25 ≤ y ≤ 0.90. a, Schematic diagram of the purely 
rhombohedral (R3c) structure where x = 0.05, y = 0.25, represented in the 
cubic perovskite subcell with polar displacement of Bi along the [111]p axis. The 
blue, orange and red spheres indicate Bi/Ca, Fe/Ti/Mg and O respectively. 
b, The purely orthorhombic (Pna2₁) structure where 0.175 ≤ x ≤ 0.35, y = 0.25 
with polar displacement of Bi along the [001]p axis. c, The ferroelectric MPB 
observed for 0.075 ≤ x < 0.175, y = 0.25, shown with superimposed polar 
displacements of Bi along the [111]p and [001]p axes. d, Long-range magnetic 
order at 300 K for x = 0.15, y = 0.80, with a proposed magnetic structure 
(orange arrows) based on a G-type antiferromagnetic arrangement and spins 
oriented perpendicular to the [111]p polarization direction. e, f, Schematic 


in the MPB region towards the rhombohedral limit, and hence are 
long-range ordered, polar, non-cubic materials. 

The x = 0.15, y = 0.60 material has 51% Fe present on the B site, 
which is above the percolation threshold for long-range magnetic 
order, and has low dielectric loss despite the enhanced d-electron 
content (Extended Data Fig. 4a). It retains the polarization switching 
characteristics of the MPB and is a ferroelectric at room temperature 
with a maximum polarization (Pmax) of 47.1 μC cm⁻² (Fig. 2a). 
Positive-up negative-down (PUND) measurements confirm the 
intrinsic nature of the measured polarization, with a remanent 
polarization of 41.5 μC cm⁻². The 300 K Mössbauer spectrum, which 
probes all the Fe nuclei in the sample, is a sharp paramagnetic doublet 
(Fig. 2b), showing that the material is not magnetically ordered at 
room temperature. The good single paramagnetic component fit 
(hyperfine field Bhf = 0) to the data excludes any magnetically ordered 
impurity phases with concentrations higher than 2 wt%. The isomer 
shift of δ = 0.22(3) mm s⁻¹ (where the number in parentheses 
represents the standard error) corresponds to Fe³⁺ in a homogeneous, 
distorted octahedral environment. The loop observed in the M(H) 
isotherm at 300 K, which has a small coercive field, is therefore 
associated with trace amounts (below diffraction detection limits) of Fe-rich 
ferrimagnetic impurities (for example, Fe3O4 or MgFe2O4 spinels; 



diagrams of the MPB microstructure and nearest-neighbour magnetic 
exchange pathways for x = 0.15, y = 0.25 (e) and for x = 0.15, y = 0.80 
(f). Each square represents a perovskite unit cell (rhombohedral in purple and 
orthorhombic in green), brown dots are distributed randomly to represent unit 
cells containing Fe, and the associated brown lines represent magnetic 
exchange pathways. A percolating exchange pathway spanning the sample is 
absent in e but present in f. g-i, Pawley fits (red lines) to PXRD data (black 
circles) in the angular range 36.0° ≤ 2θ ≤ 38.5° from composition x = 0.15, 
y = 0.80, modelled using a single R3c unit cell (g), superimposed R3c and Pna2₁ 
unit cells (h) and a single monoclinic P11a unit cell (i); see Extended Data Fig. 3 
for full patterns. Teal line, difference between the measured and fitted data; 
purple markers, hkl (R3c) reflections; green markers, hkl (Pna2₁) reflections; 
magenta markers, hkl (P11a) reflections. 


Extended Data Fig. 5), and does not correspond to long-range order 
of the perovskite. Magnetic ordering in the perovskite below 350 K was 
probed by dc-SQUID (superconducting quantum interference device) 
ZFC and FC magnetization, and thermal remanent magnetization 
(TRM) measurements (Fig. 2c). The large divergence between the 
ZFC and FC data indicates the onset of weak ferromagnetism at the 
Néel temperature, TN = 205 K, consistent with the Brillouin-like drop 
in the TRM. No other sign of magnetic ordering at lower temperature 
is observed (Extended Data Fig. 6a), suggesting that, at the MPB, the 
perovskite behaves as a single-phase magnetic material; this suggestion 
is consistent with the sharpness of the Mössbauer spectrum. The M(H) 
isotherm at 10 K (Fig. 2d) can be decomposed into two components, a 
soft magnetic phase, which is associated with a trace amount 
(approximately 0.6 wt%, consistent with the 300 K measurement shown in 
Extended Data Fig. 5c) of the Fe-rich spinel ferrite impurity, and a 
harder phase with an open hysteresis loop and a linear high-field 
contribution, which are characteristic features of a weak ferromagnet 
(Extended Data Fig. 5a). This harder magnetic phase is attributed to 
the perovskite compound with an extracted coercive field of 376 mT 
and saturation magnetization of 0.013 μB per Fe, confirming that the 
material is a weak ferromagnet, where the magnetization arises from 
ferromagnetic canting of a predominantly antiferromagnetic magnetic 




[Figure 2 image. Panels a-f with recoverable axes: a, P (μC cm⁻²) versus E (kV cm⁻¹); b, relative absorption versus velocity (mm s⁻¹); c, M (A m² kg⁻¹) versus T (K); d, M (A m² kg⁻¹) versus μ0H (T); e, α (ps m⁻¹) versus T (K); f, Mac versus Eac (kV cm⁻¹).] 

Figure 2 | Ferroelectric, magnetic and magnetoelectric properties of 
composition x = 0.15, y = 0.60. a, Polarization P versus applied electric field E 
at 300 K showing ferroelectric switching, measured at 10 Hz. Filled squares and 
circles represent the remanent polarizations from PUND measurements. Each 
colour represents the maximum applied electric field in the P(E)/PUND 
measurements; dotted lines are included as visual aids. b, Mössbauer spectrum 
measured at 300 K, with no applied magnetic field (black circles) and a single 
paramagnetic component fit (red line). c, dc magnetization measurement of TRM 
(blue line) and ZFC/FC (black/red lines) magnetization. The red dotted line 
indicates T = TN. The FC and ZFC data only converge at the highest temperature, 



because the small divergence above the Néel temperature arises from an impurity 
phase with an ordering temperature above the highest measured T. d, Isothermal 
magnetization M(H) at 10 K (black circles) and the sum of the perovskite and 
spinel impurity phase contributions (red line). e, Temperature dependence of the 
linear magnetoelectric susceptibility (α). The data points (blue squares) are the 
mean values from 10 repeated measurements, with standard errors shown in red. 
The red dotted line is T = TN. f, Induced ac magnetization (Mac) versus applied ac 
electric field amplitude (Eac) at 150 K (black squares) and 300 K (red circles). 
The data points are the mean values from 10 repeated measurements, with 
standard errors shown in red; the blue lines are linear fits to the data. 


[Figure 3 image. Panels a-f with recoverable axis labels: E (kV cm⁻¹), T (K), M (A m² kg⁻¹), μ0H (T) and Eac (kV cm⁻¹).] 


Figure 3 | Ferroelectric, magnetic and magnetoelectric properties of 
composition x = 0.15, y = 0.80. a, Polarization P versus applied electric field 
E at 300 K showing ferroelectric switching, measured at 10 Hz. Filled squares 
and circles represent the remanent polarizations from PUND measurements. 
Each colour represents the maximum applied electric field in the P(E)/PUND 
measurements; dotted lines are included as visual aids. b, Mössbauer 
spectrum measured at 300 K, with no applied magnetic field (black circles) and the 
multicomponent fit (red line). Individual components 1-4 (green, blue, cyan 
and magenta, respectively) are described in the text and summarized in 
Extended Data Table 2. c, dc magnetization measurement of TRM (blue line) 
and ZFC/FC (black/red lines) magnetization. The red dotted line indicates 
T = TN. d, Isothermal magnetization M(H) at 300 K (black circles) and the sum of 
the perovskite and minority phase spinel contributions (red line). e, Temperature 
dependence of the linear magnetoelectric susceptibility (α). The data points 
(blue squares) are the mean values from 10 repeated measurements, with standard 
errors shown in red. The red dotted line is T = TN. f, Induced ac magnetization 
(Mac) versus applied ac electric field amplitude (Eac) at 300 K (red circles). 
The data points are the mean values from 10 repeated measurements, with 
standard errors shown in red; the blue line is a linear fit to the data. 




structure. The structural symmetries present at the MPB all permit 
canting to occur within the G-type antiferromagnetic arrangement 
that is generally found for perovskite ferrites²⁵ (Fig. 1d). 

To confirm whether the two order parameters P and M are coupled, 
magnetoelectric measurements were performed on a disk that was 
poled both electrically and magnetically. The x = 0.15, y = 0.60 material 
displays linear magnetoelectric coupling (measured as the slope of 
the induced ac magnetization (Mac) versus the applied ac electric field 
amplitude (Eac)) only below the long-range ordering temperature of 
205 K (Fig. 2e, f). At 10 K, the material shows a pronounced 
magnetoelectric susceptibility α (= μ0Mac/Eac, where μ0 is the vacuum 
permeability) of −1.11(1) ps m⁻¹ (Extended Data Fig. 7; the number in 
parentheses represents the standard error), which changes sign upon 
warming to TN (ref. 26). The residual 300 K magnetoelectric coupling 
is an order of magnitude smaller than that in the magnetically ordered 
state (Fig. 2f compares data below TN at 150 K and above TN at 300 K) 
and can be associated with composite effects*'* involving the magnetic 
minority phases that are not integrated into the complex MPB micro- 
structure of the ABO; perovskite network. 
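
The way a linear magnetoelectric susceptibility follows from the slope of induced magnetization against field amplitude can be sketched as below. The data are synthetic, generated to reproduce the 10 K value of −1.11 ps m⁻¹ quoted above; they are not the measured data. The ordinary least-squares fit and all function and variable names are illustrative assumptions, and volume magnetization in A m⁻¹ is assumed so that μ0Mac/Eac has units of s m⁻¹.

```python
import math
from typing import Sequence

MU_0 = 4.0e-7 * math.pi  # vacuum permeability in T m A^-1


def fit_slope(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Ordinary least-squares slope of ys against xs (intercept allowed)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def me_susceptibility_ps_per_m(e_ac: Sequence[float],
                               m_ac: Sequence[float]) -> float:
    """alpha = mu_0 * d(M_ac)/d(E_ac), converted from s/m to ps/m.

    e_ac is in V/m and m_ac is in A/m.
    """
    return MU_0 * fit_slope(e_ac, m_ac) * 1.0e12


# Synthetic, noise-free data (hypothetical): field amplitudes from
# 0.5 to 3.0 kV/cm, expressed in V/m (1 kV/cm = 1e5 V/m).
alpha_true = -1.11e-12  # s/m
e_ac = [0.5e5 * k for k in range(1, 7)]
m_ac = [alpha_true * e / MU_0 for e in e_ac]

alpha = me_susceptibility_ps_per_m(e_ac, m_ac)
print(f"alpha = {alpha:.2f} ps/m")
```

Allowing an intercept in the fit means a constant background signal would not bias the slope; for real data one would also propagate the standard errors of the repeated measurements.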

The 51% Fe content at x = 0.15, y = 0.60 is sufficient to percolate and 
give long-range magnetic order, but the mean exchange field is too weak 
for room-temperature magnetization, because the effective number of 
nearest neighbours for superexchange is too low. The x = 0.15, y = 0.80 
composition gives 68% B-site occupancy by Fe³⁺. Mössbauer spectroscopy 
demonstrates that this increased coverage produces bulk magnetic 
order at 300 K (Fig. 3b), in contrast to x = 0.15, y = 0.60, because 
there are no paramagnetic contributions to the spectrum (Extended 
Data Table 2). ZFC/FC magnetization and TRM measurements 
(Fig. 3c) show that TN increases to 370 K. The majority (98.6(2)%) 
components 1 and 2 arise from the magnetically ordered MPB perovskite. 
Component 1 corresponds to Fe³⁺ in a slightly distorted octahedral 
environment (δ = 0.29(5) mm s⁻¹, electric quadrupole moment 
Q = 0.033(7) mm s⁻¹). The broader local field distribution and reduced 
hyperfine field in component 2 reflect the different local magnetic 
environments in a percolating system²⁷. The minority (1.3(2)%) 
components arise from spinel-derived Fe³⁺ (δ = 0.3 mm s⁻¹). There is no 
signature of further magnetic ordering at lower temperature in the 
TRM plot (Extended Data Fig. 6b). The 300 K (below TN) M(H) 
isotherm for the x = 0.15, y = 0.80 composition is similar to that observed 
for x = 0.15, y = 0.60 in the magnetically ordered state at 10 K: there are 
two components, a soft phase that is attributed to the high Fe content 
impurity (approximately 0.7 wt%), and a harder phase that is assigned 
to the perovskite with a coercive field (367 mT) and remanent 
magnetization (0.008 μB per Fe) consistent with bulk weak ferromagnetic 
behaviour (Fig. 3d and Extended Data Fig. 6b). The x = 0.15, 
y = 0.80 material is a ferroelectric at room temperature with a 
switchable maximum polarization (Pmax) of 49.9 μC cm⁻² (Fig. 3a): a 
ferroelectric polarization is still measurable at 473 K (Extended Data Fig. 8). 
The remanent polarization obtained from PUND measurement is 
43.7 μC cm⁻² (Fig. 3a and Extended Data Fig. 4c). PUND and leakage 
current measurements confirm the intrinsic origin of the polarization, 
consistent with the 300K dc resistivity of 2.1 x 10'7Q cm (Extended 
Data Fig. 4d). The switching of the intrinsic perovskite weak ferromag- 
netic magnetization at the bulk coercive field thus coexists with the 
switching of the ferroelectric polarization at room temperature. 

The long-range ordered P and M in x = 0.15, y = 0.80 afford bulk 
magnetoelectric coupling at room temperature with a linear 
magnetoelectric susceptibility (α) of 0.26(1) ps m⁻¹ (Fig. 3f). Variable 
temperature measurements (Fig. 3e) show that α is −0.91(1) ps m⁻¹ at 10 K, 
with a change of sign similar to that found for x = 0.15, y = 0.60 upon 
heating. The linear magnetoelectric susceptibility tends to zero at the 
bulk TN (Fig. 3e), demonstrating that it arises from interaction 
between the coexisting magnetic and electric long-range orders. The 
x = 0.15, y = 0.80 material with 68% Fe³⁺ B-site occupancy (and 




thus a percolating network of Fe-O-Fe superexchange paths to give 
long-range magnetic order) is a room-temperature magnetoelectric 
ferromagnetic ferroelectric material. The introduction of high-temper- 
ature long-range magnetic order into MPB systems is a diversifiable 
strategy for the generation of tunable multiferroic materials. 


Online Content Methods, along with any additional Extended Data display items 
and Source Data, are available in the online version of the paper; references unique 
to these sections appear only in the online paper. 


Received 12 March; accepted 29 June 2015. 


1. Zhu, J.-G. Magnetoresistive random access memory: the path to competitiveness
and scalability. Proc. IEEE 96, 1786-1798 (2008).
2. Bibes, M. Nanoferronics is a winning combination. Nature Mater. 11, 354-357 (2012).
3. Scott, J. F. Data storage: multiferroic memories. Nature Mater. 6, 256-257 (2007).
4. Weisheit, M. et al. Electric field-induced modification of magnetism in thin-film
ferromagnets. Science 315, 349-351 (2007).
5. Chen, X., Hochstrat, A., Borisov, P. & Kleemann, W. Magnetoelectric exchange bias
systems in spintronics. Appl. Phys. Lett. 89, 202508 (2006).
6. Hill, N. A. Why are there so few magnetic ferroelectrics? J. Phys. Chem. B 104,
6694-6709 (2000).
7. Smolenskii, G. A. & Chupis, I. E. Ferroelectromagnets. Sov. Phys. Usp. 25, 475-493
(1982).
8. Fiebig, M. Revival of the magnetoelectric effect. J. Phys. D 38, R123-R152 (2005).
9. Sosnowska, I., Neumaier, T. P. & Steichele, E. Spiral magnetic ordering in bismuth
ferrite. J. Phys. C 15, 4835-4846 (1982).
10. Popov, Y. F. et al. Linear magnetoelectric effect and phase transitions in bismuth
ferrite, BiFeO₃. JETP Lett. 57, 69-73 (1993).
11. Evans, D. M. et al. Magnetic switching of ferroelectric domains at room temperature
in multiferroic PZTFT. Nature Commun. 4, 1534 (2013).
12. Chillal, S. et al. Magnetic short- and long-range order in PbFe₀.₅Ta₀.₅O₃. Phys. Rev. B
89, 174418 (2014).
13. Pitcher, M. J. et al. Tilt engineering of spontaneous polarization and magnetization
above 300 K in a bulk layered perovskite. Science 347, 420-424 (2015).
14. Damjanovic, D. A morphotropic phase boundary system based on polarization
rotation and polarization extension. Appl. Phys. Lett. 97, 062906 (2010).
15. Nan, C.-W., Bichurin, M. I., Dong, S., Viehland, D. & Srinivasan, G. Multiferroic
magnetoelectric composites: historical perspective, status, and future directions.
J. Appl. Phys. 103, 031101 (2008).
16. Yuan, G. L. & Or, S. W. Multiferroicity in polarized single-phase Bi₀.₈₇₅Sm₀.₁₂₅FeO₃
ceramics. J. Appl. Phys. 100, 024109 (2006).
17. Zhang, S. et al. Observation of room temperature saturated ferroelectric
polarization in Dy substituted BiFeO₃ ceramics. J. Appl. Phys. 111, 074105 (2012).
18. Arnold, D. Composition-driven structural phase transitions in rare-earth-doped
BiFeO₃ ceramics: a review. IEEE Trans. Ultrason. Ferr. 62, 62-82 (2015).
19. Khomchenko, V. A. et al. Structural, ferroelectric and magnetic properties of
Bi₀.₈₅Sm₀.₁₅FeO₃ perovskite. Cryst. Res. Technol. 46, 238-242 (2011).
20. Heron, J. T. et al. Electric-field-induced magnetization reversal in a ferromagnet-
multiferroic heterostructure. Phys. Rev. Lett. 107, 217202 (2011).
21. Heron, J. T. et al. Deterministic switching of ferromagnetism at room temperature
using an electric field. Nature 516, 370-373 (2014).
22. Zhang, N. et al. The missing boundary in the phase diagram of PbZr₁₋ₓTiₓO₃.
Nature Commun. 5, 5231 (2014).
23. Noheda, B. & Cox, D. E. Bridging phases at the morphotropic boundaries of lead
oxide solid solutions. Phase Transit. 79, 5-20 (2006).
24. Mandal, P. et al. Morphotropic phase boundary in the Pb-free
(1 - x)BiTi₃/₈Fe₂/₈Mg₃/₈O₃-xCaTiO₃ system: tetragonal polarization and
enhanced electromechanical properties. Adv. Mater. 27, 2883-2889 (2015).
25. White, R. L. Review of recent work on the magnetic and spectroscopic properties of
the rare-earth orthoferrites. J. Appl. Phys. 40, 1061-1069 (1969).
26. Shtrikman, S. & Treves, D. Observation of the magnetoelectric effect in Cr₂O₃
powders. Phys. Rev. 130, 986-988 (1963).
27. Filoti,G., Kuncser, V., Rosenberg, M., Schinzer, C. & Kemmler-Sack, S. Variable rate 
spin freezing and long range antiferromagnetic order in Biz2FeRhOg. J. Alloy. Comp. 
256, 86-91 (1997). 


Acknowledgements This work was supported by the EPSRC under EP/H000925/1. 
M.J.R. is a Royal Society Research Professor. 




Author Contributions M.J.R. and J.B.C. developed the concept. P.M. carried out the
materials synthesis, characterization and physical property measurements and analysis;
H.N. performed the physical property measurements; M.J.P. and J.B.C. performed the
structural analysis; J.A. analysed the magnetic and magnetoelectric data; P.B. built the
magnetoelectric measurement equipment; P.S. performed and analysed the Mössbauer
experiments. P.M. and M.J.R. wrote the first draft; all authors contributed to the
development of the manuscript and to discussion as the project developed.


Author Information Reprints and permissions information is available at 
www.nature.com/reprints. The authors declare no competing financial interests. 
Readers are welcome to comment on the online version of the paper. Correspondence 
and requests for materials should be addressed to M.J.R. (m.j.rosseinsky@liv.ac.uk) or 
J.B.C. (j.b.claridge@liv.ac.uk). 


©2015 Macmillan Publishers Limited. All rights reserved 


METHODS 

Sample preparation. Powder samples of $(1-x)\mathrm{BiTi}_{(1-y)/2}\mathrm{Fe}_{y}\mathrm{Mg}_{(1-y)/2}\mathrm{O}_{3}$-$(x)\mathrm{CaTiO}_{3}$,
in the compositional range x = 0.15, 0.60 ≤ y ≤ 0.90, were synthesized by
a conventional solid-state reaction. The binary oxides Bi₂O₃ (99.99% Alfa Aesar,
pre-dried at 473 K), CaCO₃ (99.997% Alfa Aesar, pre-dried at 473 K), Fe₂O₃
(99.998% Alfa Aesar, pre-dried at 473 K), TiO₂ (99.995% Alfa Aesar, pre-dried
at 473 K) and MgCO₃·Mg(OH)₂·xH₂O (x ~ 3, 99.995% Alfa Aesar, used as
received) were weighed in stoichiometric amounts and ball milled in ethanol for 
20h. The mixtures obtained after evaporating ethanol were pelletized and calcined 
at 1,208 K for 12h in a platinum-lined alumina crucible. These pellets were then 
re-ground thoroughly and re-pelletized, and subjected to a second calcination at 
1,213 K for 12h in platinum-lined alumina crucibles. The resulting powders were 
found to contain only the target phase with no minority phases visible by PXRD. 
Dense pellets (>95% of crystallographic density) suitable for property measure- 
ments were produced from these powders by the following protocol. First, 2 wt% 
polyvinyl butyral binder and 0.2 wt% MnO₂ were added to the samples, and this
mixture was ball-milled for 20 h. Second, the resultant mixture was pelletized 
(8 mm diameter) with a uniaxial press, followed by pressing at about 2 × 10⁸ Pa
in a cold isostatic press. Third, these pellets were loaded into a platinum-lined 
alumina boat. Finally, a programmable tube furnace was used to heat the reaction 
under flowing oxygen to 943 K for 1 h, followed by 1,228 K for 3 h and 1,173 K for 
12 h before cooling to room temperature at 5 K min⁻¹. The resultant pellets were
found to contain no minority phases by PXRD. Their densities were measured 
using an Archimedes balance. 

Powder X-ray diffraction (PXRD). All data were collected using a PANalytical 
X’Pert Pro diffractometer in Bragg-Brentano geometry with a monochromated 
Co Kα₁ source (wavelength λ = 1.78896 Å) and position-sensitive X’Celerator
detector. Each sample was contained in a back-filled sample holder and rotated 
during the measurement. A programmable divergence slit was used to provide a 
constant illuminated area throughout the angular range. Data were collected in the 
angular range 5° ≤ 2θ ≤ 130° in steps of 0.0167°. Pawley refinements were carried
out using the software package Topas Academic (version 5). For each PXRD
pattern, background was modelled using a Chebyshev polynomial function with
12 refined parameters. Lattice parameters, a sample height correction, peak profile 
functions and model-independent peak intensities were refined. Peak profiles were 
modelled with a modified Thompson-Cox-Hastings pseudo-Voigt function.
When fitting data to a single phase (R3c or P11a cells), a Stephens anisotropic
strain broadening function was refined. In two-phase (R3c + Pna2₁) refinements,
this function was refined only for the rhombohedral (R3c) phase. 

Electrical measurements. For electric poling, gold was sputtered on both sides of 
thin disks (thickness of 130-160 μm with tolerance of 10 μm). For P(E) and
PUND measurements, silver conductive paint (RS Components) was applied on 
both sides of thin disks and cured at 393 K for 10 min. The edges were bevelled by 
approximately 0.2 mm to avoid electrical breakdown. The area of the electrode was 
measured under an optical microscope equipped with a camera and measurement 
software. The disk was loaded in a Radiant high-voltage test fixture. Silicone oil 
was used as a dielectric medium to avoid air breakdown. P(E) measurements were 
conducted using a Radiant ferroelectric tester system and an aixACCT piezo- 
electric evaluation system (aixPES). PUND measurements were carried out using 
the Radiant ferroelectric tester system with a square electric field pulse with 
a delay of 500 ms and pulse widths of 5 ms (x = 0.15, y = 0.60) and 8 ms
(x = 0.15, y = 0.80). The remanent polarizations for positive (dP/2) and negative
(−dP/2) applied electric fields are calculated as $\mathrm{d}P/2 = (P^{*} - P^{\wedge})/2$ and
$-\mathrm{d}P/2 = (-P^{*} - (-P^{\wedge}))/2$, respectively, where $P^{*}$ contains both remanent and
non-remanent polarization, whereas $P^{\wedge}$ contains only the non-remanent polarization.
$P^{*}_{r}$ and $P^{\wedge}_{r}$ are equivalent polarizations of $P^{*}$ and $P^{\wedge}$, respectively, measured
when the applied electric field is reduced to zero following the pulse.
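
The PUND arithmetic above can be sketched in a few lines. This is only an illustration: the function name and the numerical values are hypothetical, and the two inputs stand for the polarization after the switching pulse (remanent plus non-remanent) and after the non-switching pulse (non-remanent only).

```python
def remanent_half_dp(p_switching: float, p_nonswitching: float) -> float:
    """Half the remanent polarization difference, dP/2.

    p_switching:    polarization containing remanent + non-remanent parts
    p_nonswitching: polarization containing only the non-remanent part
    """
    return (p_switching - p_nonswitching) / 2.0


# Hypothetical illustrative values in uC cm^-2 (not measured data).
p_star = 60.0
p_hat = 10.0

positive_branch = remanent_half_dp(p_star, p_hat)    # +dP/2
negative_branch = remanent_half_dp(-p_star, -p_hat)  # -dP/2
print(positive_branch, negative_branch)
```

Subtracting the non-switching response removes the contributions (dielectric charging, leakage) that are common to both pulses, so only the switched remanent part survives.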

Leakage current measurements. Leakage current was measured in an aixPES 
instrument using a triangular waveform in steps of 25 V and with a step duration 
of 2 s. A switching prepolarization pulse was applied before actual measurements. 
Resistivity measurements. Resistivity was measured using the two-probe method 
in a Magnetic Property Measurement System (MPMS) XL-7 SQUID magneto- 
meter (Quantum Design). The pellet was loaded into a modified dc-SQUID probe 
and connected to a Keithley 6430 sub-femtoamp remote sourcemeter.
Impedance measurements. Impedance and phase angles were measured using an 
Agilent LCR meter E4980 by applying an ac voltage of 0.5 V in the frequency range 
20 Hz to 2 MHz. 

Magnetic measurements. Magnetic measurements were carried out using MPMS 
XL-7 and MPMS3 systems (Quantum Design). For this, powder or pellet samples
were loaded into a polycarbonate capsule and fixed into a straight plastic drinking
straw and then loaded into a dc-SQUID probe. The Néel temperature ($T_N$) was
determined from the peak of $\mathrm{d}M_{\mathrm{TRM}}/\mathrm{d}T$. The isothermal dc magnetization data
were decomposed using the general function $M(H) = \sum_i m_i(H)$, where $m_i$ are
generic functions describing single magnetic components taking the form
$m_i(H) = a\tanh\left(\frac{H \pm b}{c}\right) + dH$. Here a represents the saturation magnetization, b
the coercive field, c is a parameter that describes the squareness of the loop, 
and d is a linear term that includes paramagnetic, diamagnetic and antiferro- 
magnetic contributions for the individual magnetic component. Above the mag- 
netic ordering temperature of the perovskite phase, only one component was 
used to describe the isothermal magnetization assigned to a high Fe content 
impurity. Below the perovskite magnetic ordering temperature, two components 
were used. 
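
A minimal sketch of such a multi-component decomposition is given below. The functional form (a tanh term shifted by the coercive field plus a linear term in H) is an assumption made for illustration, and all parameter values are hypothetical rather than fitted to any data.

```python
import math


def component(h, a, b, c, d):
    """One magnetic component (assumed form, for illustration only).

    a: saturation magnetization, b: coercive field,
    c: squareness parameter,     d: slope of the linear term.
    """
    return a * math.tanh((h - b) / c) + d * h


def total_magnetization(h, components):
    """Sum of all single-component contributions at field h."""
    return sum(component(h, *params) for params in components)


# Hypothetical parameter sets (a, b, c, d); not values from the paper.
perovskite = (0.05, 0.10, 0.20, 0.010)
impurity = (0.02, 0.01, 0.05, 0.0)

# One component above the ordering temperature, two below it.
m_above = total_magnetization(1.0, [impurity])
m_below = total_magnetization(1.0, [perovskite, impurity])
print(m_above, m_below)
```

In practice the parameters of each component would be obtained by least-squares fitting to the measured M(H) loop; the sketch only shows how the components add.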
Mössbauer spectroscopy. Absorption mode Mössbauer spectroscopy measurements
were performed at room temperature, using an electromagnetic Doppler
drive system, a ⁵⁷Co(Rh) γ-ray source with an actual activity of about 20 mCi and a
Xe-gas Reuter Stokes proportional counter, and Canberra amplification, discrim- 
ination and scaling electronics. Samples were diluted with sucrose (icing sugar), for 
measurements at an approximate ratio of 0.2, to prevent excessive line-shape 
distortion and non-resonant absorption, owing to the high bismuth content of 
the samples. Custom modelling and nonlinear least-squares error minimization 
routines were used for the extraction of the spectroscopic parameters. Isomer 
shifts are reported with respect to the source. 
Magnetoelectric measurements. Details of the magnetoelectric measurement
set-up (ref. 28) and protocol are described elsewhere. Note that the load resistance,
mentioned and used in ref. 28 to gain a suitable voltage drop from the MPMS
ac coil power supply, was omitted in our experiments; instead, the MPMS ac coil
power supply was directly connected to the input of a Krohn-Hite 7600 M wide-band
power amplifier. In this experiment, a sinusoidal electric field $E = E_{ac}\cos(\omega t)$
(where $\omega = 2\pi f$, with $f$ the frequency, and $E_{ac}$ is the electric field amplitude) is applied
across the disk and the first harmonic of the complex ac magnetic moment,
$m(t) = (m' - \mathrm{i}m'')\cos(\omega t)$, is measured. The measurements were performed in
the absence of any dc magnetic or electric fields. In this scenario, the real part of
the electrically induced magnetic moment is $m' = \alpha E_{ac} V/\mu_0$, where V is the sample
volume. This moment involves only the linear magnetoelectric (α) effect; the
higher-order effects are zero. The corresponding electrically induced volume ac
magnetization is defined as $M_{ac} = m'/V$. To demonstrate the linear magnetoelectric
effect on x = 0.15, y = 0.60 and x = 0.15, y = 0.80, the electric field amplitude
$E_{ac}$ was varied and the induced moment was recorded. Linear magnetoelectric
susceptibility (α) was calculated from a plot of volume ac magnetization amplitude
$M_{ac}$ (= m'/V) versus $E_{ac}$: $\alpha = \mu_0\,\Delta M_{ac}/\Delta E_{ac}$ (ref. 29). All measurements were
performed at f = 1 Hz with 20 blocks to average and 10 scans per measurement.
The sensitivity of the experimental set-up used here is $|m'| = V \times M_{ac} > 5 \times 10^{-12}$ A m².
Prior to magnetoelectric measurements, disks were poled
externally using the aixPES instrument at a field of 100 kV cm⁻¹ for 15 min from
343 K to room temperature. Disks were then loaded into a modified dc-SQUID
probe at 300 K and subjected to a magnetic field of 2 T for 30 min. After the
removal of the electric and magnetic fields, electrodes were short circuited for
15 min before conducting magnetoelectric measurements at 300 K. For magnetoelectric
measurements at 10 K and 150 K (for x = 0.15, y = 0.60), the sample was
cooled down to the measurement temperature in the presence of an electric field
(3.5 kV cm⁻¹) and a magnetic field (2 T) and the protocol for 300 K measurement
was followed. To determine the temperature dependence of α, an electric field
(3.5 kV cm⁻¹ for y = 0.60 and 2.7 kV cm⁻¹ for y = 0.80) and a magnetic field of
2 T were applied at 300 K, followed by cooling to 10 K at a rate of 1 K min⁻¹; the
data were collected at 1 Hz. The temperature was stabilized for 5 min at each step
before measurement. The room-temperature bulk dc resistivity of x = 0.15,
y = 0.60 is 3.3 × 10!* Ω cm, and that of x = 0.15, y = 0.80 is 2.1 × 10’* Ω cm.
The leakage currents observed for y = 0.60 and y = 0.80 are 0.35 nA (320 K)
and 11.4 nA (360 K), respectively, at the maximum measurement fields. These
values are too low to cause any artefacts in the magnetoelectric measurements.
The upper limit of temperature in this measurement set-up is 360 K.
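
As an illustration of how a linear magnetoelectric coefficient can be extracted from such data, the sketch below fits a straight line to ac magnetization versus field amplitude and multiplies the slope by μ0. The helper names and the data points are hypothetical, and SI units are assumed (M in A m⁻¹, E in V m⁻¹, giving α in s m⁻¹).

```python
import math

MU0 = 4e-7 * math.pi  # vacuum permeability in T m A^-1 (classical value)


def least_squares_slope(xs, ys):
    """Slope of the ordinary least-squares line through (xs, ys)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def linear_me_coefficient(e_ac, m_ac):
    """alpha = mu0 * dM_ac/dE_ac, with the derivative taken as a fitted slope."""
    return MU0 * least_squares_slope(e_ac, m_ac)


# Hypothetical data: field amplitudes in V/m and a perfectly linear response.
e_ac = [1.0e5, 2.0e5, 3.0e5, 4.0e5]
k = 2.0e-7  # hypothetical slope in (A/m) per (V/m)
m_ac = [k * e for e in e_ac]

alpha = linear_me_coefficient(e_ac, m_ac)
print(alpha)  # in s/m
```

Using a fitted slope rather than a single ratio makes the estimate insensitive to a constant offset in the measured moment.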


28. Borisov, P., Hochstrat, A., Shvartsman, V. V. & Kleemann, W. Superconducting 
quantum interference device setup for magnetoelectric measurements. Rev. Sci. 
Instrum. 78, 106105 (2007). 

29. Schmid, H. Some symmetry aspects of ferroics and single phase multiferroics. 
J. Phys. Cond. Matter 20, 434201 (2008). 




[Figure: two panels; horizontal axes T (K) (left) and μ0H (T) (right).]

Extended Data Figure 1 | Magnetic properties of composition x = 0.15, y = 0.25. Left, magnetization versus temperature, cooled in zero applied field (ZFC, black line), cooled in 1 mT applied field (FC, red line) and the thermal remanent magnetization in zero applied field (TRM, blue line). Note negative TRM curve is due to a negative remanent magnetic field in the superconducting magnet. Right, magnetization versus magnetic field at 100 K.


[Figure: stacked PXRD patterns, one per composition; horizontal axis 2θ (°), 20 to 120.]

Extended Data Figure 2 | PXRD patterns obtained from six compositions of
the series $(1-x)\mathrm{BiTi}_{(1-y)/2}\mathrm{Fe}_{y}\mathrm{Mg}_{(1-y)/2}\mathrm{O}_{3}$-$(x)\mathrm{CaTiO}_{3}$ where x = 0.15,
0.60 ≤ y ≤ 0.90. The weak reflection marked with the + symbol, which is
visible in the y = 0.70 and y = 0.75 patterns, corresponds to the most intense
reflection of sillenite (Bi₂₅FeO₄₀). All other peaks are indexed to the target
perovskite phase using rhombohedral, rhombohedral + orthorhombic, or
monoclinic cells, as discussed in the text.


[Figure: six panels a-f; vertical axes counts, horizontal axes 2θ (°), 20 to 120.]

Extended Data Figure 3 | Pawley fits to PXRD patterns collected from two
compositions of the series $(1-x)\mathrm{BiTi}_{(1-y)/2}\mathrm{Fe}_{y}\mathrm{Mg}_{(1-y)/2}\mathrm{O}_{3}$-$(x)\mathrm{CaTiO}_{3}$.
a-f, x = 0.15, y = 0.60 (a-c) and x = 0.15, y = 0.80 (d-f) modelled as a single
rhombohedral phase in space group R3c (a, d), as a combination of
rhombohedral (R3c) and orthorhombic (Pna2₁) phases (b, e) and as a single
monoclinic phase in space group P11a, which is a subgroup of R3c and Pna2₁
(c, f). Black circles, y_obs; red line, y_calc; teal line, (y_obs − y_calc); blue markers, hkl
(R3c) reflections; green markers, hkl (Pna2₁) reflections; magenta markers, hkl
(P11a) reflections. Insets are zooms of the main plots.


[Figure: four panels a-d; panel b horizontal axis E (kV cm⁻¹), −150 to 150; panel c shows the PUND pulse sequence; panel d horizontal axis T (K), 280 to 380.]

Extended Data Figure 4 | Dielectric, polarization and leakage
characteristics. a, Frequency dependence of dielectric permittivity (left axis,
dashed line) and loss (right axis, solid line) at 300 K for x = 0.15, y = 0.60
(black) and x = 0.15, y = 0.80 (red). b, A typical P(E) loop (right axis, blue line)
with the corresponding current density (J_PE; left axis, black line) and the
leakage current density (J_l; left axis, red line) for x = 0.15, y = 0.80. c, The
polarization (blue line, left axis) and electric field profile (red dotted line,
right axis) from PUND measurement of x = 0.15, y = 0.80 (see Methods for
details). d, Temperature dependence of dc resistivity of x = 0.15, y = 0.80,
showing highly insulating behaviour. In a-c, the arrows point to the relevant
axis for each curve.


[Figure: three panels a-c of magnetization versus field; legends: data, perovskite contribution, spinel contribution, total contribution (a, b); data, fit and extracted impurity (c).]

Extended Data Figure 5 | Isothermal magnetization M(H). a, b, x = 0.15,
y = 0.60 at T = 10 K < T_N (a) and x = 0.15, y = 0.80 at T = 300 K < T_N (b).
The experimental data are represented as black filled circles. Red lines show the
sum of the perovskite phase (blue line) and spinel impurity phase (green dashed
line) contributions. c, x = 0.15, y = 0.60 at T = 300 K > T_N and x = 0.15,
y = 0.80 at T = 395 K > T_N. The experimental data are represented as open
circles (x = 0.15, y = 0.60) or squares (x = 0.15, y = 0.80); green dash-dotted
and dashed lines show extracted spinel impurity contributions for x = 0.15,
y = 0.60 and x = 0.15, y = 0.80, respectively; red lines show fits to the data.


[Figure: two panels a, b; left axes M (A m² kg⁻¹).]

Extended Data Figure 6 | Thermal remanent magnetization data.
a, b, Thermal remanent magnetization (TRM; left axis, black circles) and
derivative of TRM with respect to temperature (dM_TRM/dT; right axis, blue
lines) for x = 0.15, y = 0.60 (a) and x = 0.15, y = 0.80 (b). Arrows indicate
the axis that each dataset corresponds to.


[Figure: single panel at T = 10 K; horizontal axis E (kV cm⁻¹), 0 to 4.]

Extended Data Figure 7 | Linear magnetoelectric effect for x = 0.15,
y = 0.60 at 10 K. Red squares are mean values, error bars in red are standard
errors from 10 repeated measurements. The blue line is a linear fit to the data.


[Figure: two panels a, b; horizontal axes E (kV cm⁻¹).]

Extended Data Figure 8 | P(E) measurements above room temperature. a, b, Measurements for x = 0.15, y = 0.60 at frequency f = 100 Hz (a) and x = 0.15,
y = 0.80 at frequency f = 150 Hz (b) at 473 K.


Extended Data Table 1 | Refined lattice parameters and agreement factors from Pawley fits to PXRD data

                     Refined lattice parameters (space group P11a)                  Weighted profile R-factor (Rwp)   Goodness of fit (χ²)
Composition          a (Å)      b (Å)      c (Å)      γ (°)      Volume (Å³)   R3c     R3c + Pna2₁   P11a     R3c     R3c + Pna2₁   P11a
x = 0.15, y = 0.60   5.6037(3)  7.9047(6)  5.5666(1)  89.433(7)  246.56(2)     7.421   6.373         6.169    2.104   1.583         1.499
x = 0.15, y = 0.80   5.6019(7)  7.903(1)   5.5641(1)  89.40(1)   246.32(5)     6.883   6.136         5.969    1.821   1.477         1.392

Refined lattice parameters (a, b, c, γ) and the corresponding unit cell volumes, obtained by fitting to a √2a_p × 2a_p × √2a_p unit cell in space group P11a, and agreement factors (Rwp and χ²) from Pawley fits to PXRD
data, fitted in three different candidate space groups, for compositions x = 0.15, y = 0.60 and x = 0.15, y = 0.80.
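
As a consistency check on the tabulated cells, the volume of a monoclinic cell whose only non-right angle is γ can be computed as V = a·b·c·sin γ. The short sketch below applies this to the refined parameters quoted above; the function name is hypothetical and the standard uncertainties in parentheses are ignored.

```python
import math


def monoclinic_volume(a, b, c, gamma_deg):
    """Cell volume in cubic angstroms for a cell with alpha = beta = 90 degrees."""
    return a * b * c * math.sin(math.radians(gamma_deg))


# Refined P11a parameters from Extended Data Table 1 (uncertainties dropped).
v_060 = monoclinic_volume(5.6037, 7.9047, 5.5666, 89.433)  # x = 0.15, y = 0.60
v_080 = monoclinic_volume(5.6019, 7.903, 5.5641, 89.40)    # x = 0.15, y = 0.80
print(round(v_060, 2), round(v_080, 2))
```

Both results agree with the tabulated volumes within their quoted uncertainties.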


Extended Data Table 2 | Spectroscopic parameters from Mössbauer data fitting of x = 0.15, y = 0.80 at 300 K

Component   Area (%)   δ (mm s⁻¹)   Q (mm s⁻¹)   B_hf (T)
1           30.8(1)    0.29(5)      0.033(7)     46(1)
2           67.8(1)    0.31(5)      -0.003(2)    23(4)
3           0.6(1)     0.30(5)      -0.063(4)    33.5(5)
4           0.7(1)     0.30(5)      -0.32(1)     41.7(5)

The area, isomer shift (δ), electric quadrupole moment (Q) and hyperfine field (B_hf) for different components, extracted from a multicomponent fit, with standard errors in parentheses.




doi:10.1038/nature15371 


The contribution of outdoor air pollution sources to 
premature mortality on a global scale 


J. Lelieveld¹,², J. S. Evans³,⁴, M. Fnais⁵, D. Giannadaki² & A. Pozzer¹


Assessment of the global burden of disease is based on epidemiological
cohort studies that connect premature mortality to a wide
range of causes, including the long-term health impacts of ozone
and fine particulate matter with a diameter smaller than 2.5 micrometres
(PM2.5). It has proved difficult to quantify premature mortality
related to air pollution, notably in regions where air quality is
not monitored, and also because the toxicity of particles from various
sources may vary. Here we use a global atmospheric chemistry
model to investigate the link between premature mortality and seven
emission source categories in urban and rural environments. In
accord with the global burden of disease for 2010 (ref. 5), we
calculate that outdoor air pollution, mostly by PM2.5, leads to 3.3
(95 per cent confidence interval 1.61-4.81) million premature deaths
per year worldwide, predominantly in Asia. We primarily assume
that all particles are equally toxic, but also include a sensitivity study
that accounts for differential toxicity. We find that emissions from
residential energy use such as heating and cooking, prevalent in India
and China, have the largest impact on premature mortality globally,
being even more dominant if carbonaceous particles are assumed
to be most toxic. Whereas in much of the USA and in a few other
countries emissions from traffic and power generation are important,
in eastern USA, Europe, Russia and East Asia agricultural emissions
make the largest relative contribution to PM2.5, with the estimate of
overall health impact depending on assumptions regarding particle
toxicity. Model projections based on a business-as-usual emission
scenario indicate that the contribution of outdoor air pollution to
premature mortality could double by 2050.

Air pollution is associated with many health impacts, including
chronic obstructive pulmonary disease (COPD) linked to enhanced
ozone (O3), and acute lower respiratory illness (ALRI), cerebrovascular
disease (CEV), ischaemic heart disease (IHD), COPD and lung
cancer (LC) linked to PM2.5 (ref. 8). Many previous studies have
been based on air quality measurements, largely focusing on urban
pollution. Atmospheric chemistry and transport models have
been used to account for other environments, including those for
which no measurement data are available.

Recently, enhanced resolution regional and global models and
satellite data have been applied to improve estimates of PM2.5 and
O3 concentrations and their impact on air quality. Here we present
results obtained with an atmospheric chemistry-general circulation
model, applied at high resolution to compute global air quality
changes, combined with population data, country-level health statistics
and pollution exposure response functions (Methods). Our calculations
of air pollution related mortality are based on the method of the
global burden of disease (GBD) for 2010 (ref. 5), applying improved
exposure response functions that more realistically account for
health effects at very high PM2.5 concentrations compared to former
assessments. This is particularly relevant for some parts of the world
where air pollution has increased nearly unabated and for future scenarios
that project the continued growth of emissions. Following the
GBD we also include desert dust (which is largely natural) with PM2.5;
hence strictly speaking we assess the effects of atmospheric composition.

The air quality guidelines of the World Health Organization
(WHO) and national regulatory policies are based on exposure response
functions that rely on PM2.5 mass concentrations, implicitly
treating all fine particles as equally toxic without regard to their source
and chemical composition. However, expert elicitation suggests that
carbonaceous particles are more toxic than crustal material, nitrates
and sulfates. A recent study finds that PM2.5 from coal combustion
leads to increased mortality risk from cardiovascular disease and LC,
but that the evidence is much weaker for other sources, whereas estimates
using non-specific PM2.5 mass alone may underestimate the total
effect of PM2.5 on mortality. Further, this study did not find support for
mortality from biomass combustion and soil dust particles. However,
this and a subsequent report by the Health Effects Institute in the USA
also note that there were only a limited number of cities in these
investigations where these sources and components were likely to be
measured consistently. While the evidence for differential toxicity is
far from conclusive, we conducted a secondary analysis assuming that
carbonaceous PM2.5 is five times more toxic than inorganic particles,
though maintaining the same overall health impact of PM2.5.
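
One simple way to formalize a "five times more toxic, same overall impact" assumption is to assign relative toxicity weights to the carbonaceous and inorganic mass fractions and normalize them so that the mass-weighted mean stays at 1. The sketch below is an illustration under that assumption only; it is not taken from the paper's Methods, and the function name and the example fraction are hypothetical.

```python
def toxicity_weights(carbon_fraction, ratio=5.0):
    """Return (w_carbon, w_inorganic) with w_carbon = ratio * w_inorganic.

    The weights are normalized so that
    w_carbon * carbon_fraction + w_inorganic * (1 - carbon_fraction) == 1,
    which keeps the total PM2.5 health impact unchanged.
    """
    inorganic_fraction = 1.0 - carbon_fraction
    w_inorganic = 1.0 / (inorganic_fraction + ratio * carbon_fraction)
    return ratio * w_inorganic, w_inorganic


# Hypothetical example: 20% of the PM2.5 mass is carbonaceous.
w_carbon, w_inorganic = toxicity_weights(0.2)
print(round(w_carbon, 3), round(w_inorganic, 3))
```

Under this weighting, sources rich in carbonaceous particles gain a larger share of the attributed mortality, while sources dominated by inorganic particles lose share, with the total held fixed.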

We have calculated premature mortality linked to CEV, COPD, IHD
and LC for adults ≥30 years old, and ALRI for infants <5 years old
(Table 1 and Extended Data Tables 1 and 2). Our estimate of the global
PM2.5 related mortality in 2010 is 3.15 million people with a 95% confidence
interval (CI95) of 1.52-4.60 million. The main causes are CEV
(1.31 million) and IHD (1.08 million), and secondary causes are COPD
(374 thousand), ALRI (230 thousand) and LC (161 thousand). Our
global estimate of O3 related mortality by COPD is 142 (CI95: 90-208)
thousand. Our total estimate of 3.30 (CI95: 1.61-4.81) million
people in 2010 agrees closely with the GBD. This is in addition to
the estimated 3.54 million deaths per year caused by indoor air pollution
due to use of solid fuels for cooking and heating. Figure 1 shows
the geographic distribution and demonstrates the locations of hotspots
in China, India and many of the large urban centres.
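
The totals quoted above follow directly from the cause-specific numbers, as the short check below shows. The values are the World 2010 figures given in the text, in thousands of deaths; the variable names are illustrative.

```python
# PM2.5-related premature deaths in 2010 by cause (thousands), from the text.
pm25_deaths = {"CEV": 1311, "IHD": 1079, "COPD": 374, "ALRI": 230, "LC": 161}

# O3-related COPD deaths in 2010 (thousands), from the text.
o3_copd_deaths = 142

pm25_total = sum(pm25_deaths.values())    # about 3.15 million
all_causes = pm25_total + o3_copd_deaths  # about 3.30 million
print(pm25_total, all_causes)
```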

Considering the global population of 6.8 billion in 2010, it follows
that the mean per capita mortality attributable to air pollution is about
5 per 10,000 person-years. Of these 5 persons per 10,000 worldwide,
about 2 die by CEV, 1.6 by IHD, 0.8 by COPD, 0.35 by ALRI and 0.25 by
LC. The highest per capita mortality is found in the Western Pacific
region, followed by the Eastern Mediterranean and Southeast Asia. The
combination of high per capita mortality with high population density
explains the (by far) highest number of deaths in the Western Pacific,
China being the main contributor (1.36 million per year). Note that the
mortality attributable to air pollution in China is approximately an
order of magnitude higher than that attributable to Chinese road transport
injuries and HIV/AIDS, and ranks among the top causes of
death. Southeast Asia has the second highest premature mortality,
where India is the main contributor (0.65 million per year). The global
mortality linked to air pollution is strongly influenced by these high
numbers in Asia.

¹Max Planck Institute for Chemistry, Atmospheric Chemistry Department, 55128 Mainz, Germany. ²The Cyprus Institute, Energy, Environment and Water Research Center, 1645 Nicosia, Cyprus. ³Harvard School of Public Health, Boston, Massachusetts 02215, USA. ⁴Cyprus International Institute for Environment and Public Health, Cyprus University of Technology, 3041 Limassol, Cyprus. ⁵King Saud University, College of Science, Riyadh 11451, Saudi Arabia.

17 SEPTEMBER 2015 | VOL 525 | NATURE | 367

Table 1 | Premature mortality related to PM2.5 and O3 for the population <5 and ≥30 years old

                                                  Mortality attributable to air pollution (deaths × 10³)
                                                  PM2.5                                                          O3
WHO region              Year   Population (×10⁶)  ALRI <5 yr  IHD ≥30 yr  CEV ≥30 yr  COPD ≥30 yr  LC ≥30 yr    COPD ≥30 yr  Total
Africa                  2010   809                90          55          78          10           2            2            237
                        2050   1,807              158         185         262         38           5            12           660
Americas                2010   930                0           44          8           4            7            5            68
                        2050   1,191              0           75          15          7            11           11           119
Eastern Mediterranean   2010   602                56          115         86          12           5            12           286
                        2050   1,021              66          321         246         37           13           40           723
Europe                  2010   867                1           239         95          15           25           6            381
                        2050   886                1           307         156         18           37           11           530
Southeast Asia          2010   1,762              64          327         250         124          15           82           862
                        2050   2,332              104         865         807         419          48           227          2,470
Western Pacific         2010   1,812              19          299         794         209          107          35           1,463
                        2050   1,861              16          413         1,120       309          155          57           2,070
World                   2010   6,783              230         1,079       1,311       374          161          142          3,297
                        2050   9,098              346         2,166       2,604       828          270          358          6,572

Regions are defined by the World Health Organization, see Extended Data Table 1. Results for 2050 are based on a business-as-usual scenario.
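
The per capita figures quoted above can be reproduced from the World 2010 row of Table 1. In the sketch below, COPD combines the PM2.5- and O3-related deaths, which is an assumption about how the quoted 0.8 per 10,000 was obtained; the variable names are illustrative.

```python
# World 2010 values from Table 1: population in millions, deaths in thousands.
population_million = 6783
deaths_thousand = {
    "CEV": 1311,
    "IHD": 1079,
    "COPD": 374 + 142,  # PM2.5-related plus O3-related COPD (assumed grouping)
    "ALRI": 230,
    "LC": 161,
}


def per_10000(deaths_k, population_m):
    """Deaths per 10,000 people per year."""
    return (deaths_k * 1e3) / (population_m * 1e6) * 1e4


rates = {cause: per_10000(d, population_million) for cause, d in deaths_thousand.items()}
total_rate = sum(rates.values())
print(round(total_rate, 2), {c: round(r, 2) for c, r in rates.items()})
```

The total comes out close to 5 per 10,000, with CEV near 2, IHD near 1.6 and COPD near 0.8, in line with the rounded values in the text.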

We determined the impacts of seven source categories by sub- 
tracting them one by one from the emissions in our model. These 
sensitivity calculations show the efficacy of individually controlling 
these sources. The 15 countries with highest premature mortality 
attributable to air pollution in 2010 are listed in Table 2 along with 
the contribution of each source category. Residential and commercial 
energy use (RCO) is the largest source category worldwide, contrib- 
uting nearly one-third, and almost a factor of 2 more under the alterna- 
tive assumption of differential toxicity. Note that this only refers to 
mortality by outdoor exposure to this source. Our estimate of 1.0 mil- 
lion deaths per year by RCO is in addition to the 3.54 million deaths 
per year due to indoor air pollution from essentially the same source.

The next largest anthropogenic source category is agriculture
(AGR), contributing one-fifth; however, this reduces significantly
under the assumption of differential particle toxicity. The successive
principal anthropogenic categories are power generation (PG), industry
(IND), biomass burning (BB) and land traffic (TRA), and taken
together they cause nearly one-third of all air pollution mortality. If
carbonaceous particles are five times more toxic than sulfates and
nitrates, these sources together account for one-quarter of the mortality.
Natural sources make up for the remaining one-sixth of the total.
However, if crustal material is five times less toxic than carbonaceous
PM2.5 this reduces considerably. The most important source category
in each region in 2010 is shown in Fig. 2.

RCO is foremost in the populous parts of Asia. It refers to small 
combustion sources, especially biofuel use (for heating and cooking), 
and also waste disposal and diesel generators. In China it contributes 
about 32%, in India, Bangladesh, Indonesia and Vietnam 50-60%, 
while in Nepal it is highest with nearly 70% (Extended Data Table 3). 
In western countries it is typically 5-10%, although in France and 
Poland it contributes about 15%. The contribution of this pollution 
source to mortality is sensitive to toxicity assumptions and large uncer- 
tainty related to IHD. Because of the comparatively large fraction of 
carbonaceous PM2.5, under our alternative calculations where these
aerosols are five times more toxic, RCO increases from 31% to 59% 
of global air pollution mortality. If, on the other hand, we assume that 
RCO does not contribute to IHD mortality, this fraction decreases from 
31% to 26% (Methods). 

Agriculture (AGR) has a remarkably large impact on PM2.5, and is
the leading source category in Europe, Russia, Turkey, Korea, Japan and 
the Eastern USA (Fig. 2). In many European countries, its contribution 
is 40% or higher. Agricultural releases of ammonia (NH3) from 


Figure 1 | Mortality linked to outdoor air pollution in 2010. Units of mortality, deaths per area of 100 km × 100 km (colour coded). In the white areas, annual
mean PM2.5 and O3 are below the concentration-response thresholds where no excess mortality is expected.


368 | NATURE | VOL 525 | 17 SEPTEMBER 2015 


©2015 Macmillan Publishers Limited. All rights reserved 


LETTER 


Table 2 | Top 15 ranked countries of premature mortality linked to outdoor air pollution in 2010 


Country     Deaths (×10³)  Residential energy  Agriculture  Natural  Power generation  Industry  Biomass burning  Land traffic
China       1,357          32 (76)             29 (7)       9 (3)    18 (7)            8 (3)     1 (2)            3 (2)
India       645            50 (77)             6 (1)        Lig)     14 (5)            7 (3)     7 (9)            5 (4)
Pakistan    111            31 (67)             2 (1)        57 (23)  2 (1)             2 (2)     2 (3)            3 (3)
Bangladesh  92             55 (78)             10 (2)       0 (0)    15 (6)            7 (2)     7 (8)            6 (4)
Nigeria     89             14 (31)             1 (0)        77 (52)  0 (0)             0 (0)     8 (16)           0 (0)
Russia      67             7 (18)              43 (26)      1 (0)    22 (17)           8 (5)     8 (21)           11 (13)
USA         55             6 (12)              29 (17)      2 (2)    31 (19)           6 (5)     5 (9)            21 (36)
Indonesia   52             60 (64)             2 (0)        0 (0)    5 (3)             4 (2)     27 (29)          2 (2)
Ukraine     51             6 (13)              52 (32)      0 (0)    18 (17)           9 (7)     5 (18)           10 (13)
Vietnam     44             51 (74)             12)          0 (0)    13 (4)            8 (3)     12 (14)          4 (3)
Egypt       35             1 (2)               3 (3)        92 (88)  2 (2)             1 (1)     0 (1)            (3)
Germany     34             8 (17)              45 (26)      0 (0)    13 (10)           13 (8)    1 (3)            20 (36)
Turkey      32             9 (20)              29 (19)      15 (6)   19 (14)           11 (8)    6 (19)           (14)
Iran        26             1 (3)               6 (6)        81 (75)  4 (4)             3 (3)     1 (2)            4 (7)
Japan       25             12 (29)             38 (22)      0 (0)    17 (15)           18 (14)   5 (8)            10 (12)
World       3,297          31 (59)             20 (7)       18 (11)  14 (7)            78)       5 (8)            5 (5)

Columns 3-9 show contributions (%) of the seven main source categories, the leading one in bold. For details and additional countries, see Extended Data Table 3. In parentheses are shown sensitivity calculations
with carbonaceous particles having a five times larger impact than inorganic aerosol compounds.


fertilizer use and domesticated animals affect air quality through several 
multiphase chemical pathways, forming ammonium sulphate and 
nitrate. Since NH3 abundance is often limiting in PM2.5 formation,
reduction of its emissions can make an important contribution to air
quality control’. As agricultural emissions mostly form inorganic
PM2.5, the impact on mortality diminishes under the assumption that
carbonaceous PM2.5 is five times more toxic.

Natural sources (NAT) contribute strongly to mortality, being dom- 
inant in northern Africa and the Middle East, and also a leading category 
in Central Asia (Table 2 and Fig. 2). Although we categorize airborne 
desert dust as natural, a fraction is anthropogenic due to the role of 
humans in desertification and agricultural practices*®. The chronic health 
and mortality impacts associated with exposure to dust are more uncer- 
tain than those due to typical air pollution in industrialized countries 
where most of the epidemiological cohort studies have been carried out. 
If all fine particles are equally toxic, then natural sources are responsible 
for about one-sixth of air pollution mortality. If fine carbonaceous part- 
icles are five times more toxic than crustal material, then natural sources 
account for only about one-tenth of air pollution induced mortality. 

Power generation (PG) by fossil fuel fired power plants is the third 
largest anthropogenic source category, being an important source of
SO2 and NOx, which are converted to sulfate and nitrate in the atmosphere.
It accounts for about one-seventh of population exposure to
PM2.5 and O3. Power plant emissions are quite important in the USA
(>30%) and in Russia, Korea and Turkey (roughly 20%). Emissions 
from power generation also have particularly large impacts on fine 
particle concentrations in the Middle East, but frequently these go 
unnoticed as they are masked by desert dust. The role of this source 
is sensitive to the assumed PM2.5 toxicity, reducing by a factor of 2 if
sulfate and nitrate are five times less toxic than carbonaceous PM2.5.

Industry (IND) is among the smaller source categories, with a global 
fraction of about 7% (Table 2); nevertheless, it contributes about twice 
this percentage in most of the western world. It includes iron and steel, 
chemical, pulp and paper, food, solvent and other manufacturing sec- 
tors, oil refineries and fuel production. This source of air pollution is 
generally significant in industrialized countries and emerging eco- 
nomies, but rarely the leading cause of premature mortality. Under 
the differential toxicity assumption, its contribution to mortality 
would reduce by more than a factor of 2. 

Our calculations suggest that land traffic (TRA) emissions are 
responsible for about one-fifth of mortality by ambient PM2.5 and
O3 in Germany, the UK and the USA, while globally they account


Figure 2 | Source categories responsible for the largest impact on mortality
linked to outdoor air pollution in 2010. Source categories (colour coded):
IND, industry; TRA, land traffic; RCO, residential and commercial energy use
(for example, heating, cooking); BB, biomass burning; PG, power generation;
AGR, agriculture; and NAT, natural. In the white areas, annual mean PM2.5 is
below the concentration-response threshold.




for about 5%. Because emissions of NOx are the dominant source of
traffic-related PM2.5 in the form of nitrate, together with carbonaceous
PM2.5, the results from our alternative calculations (assuming carbonaceous
particles are five times more toxic than nitrates and other inorganics)
also indicate a 5% contribution, globally. Note that this
contribution is likely to be a lower limit as traffic also emits other pollutants
that are not included in, or influential on, PM2.5 (ref. 31) (Methods).

Biomass burning (BB) is also a relatively small source category with 
a global contribution of about 5%. Nevertheless, its areal range is large, 
for example in South America and Africa. It is the main source of air 
pollution in large parts of Canada, Siberia, Africa, South America and 
Australia. Because in many parts of these countries annual mean PM2.5
is below the concentration-response threshold (Methods), these areas 
are shown white in Fig. 2. Biomass burning is also widespread in 
southeastern Asia, although in populous parts of Vietnam and 
Indonesia (for example, Java) residential energy use is larger and there- 
fore the leading category (Table 2). 

In the Southern Hemisphere biomass burning is generally the leading
contributor to PM2.5, with some exceptions. In Brazil it contributes
about 70%, and in many African countries its impact can also be high,
up to >90% in Angola. Note that the health impacts of PM2.5 from
biomass burning are quite uncertain, especially the attribution of IHD 
related mortality, due to a dearth of epidemiological cohort studies in 
regions where this pollution source predominates (Methods). Our 
calculations suggest that it is responsible for between 5% (equal tox- 
icity) and 8% (differential toxicity) of air pollution induced mortality. 

To understand how the premature mortality attributable to air 
pollution may develop in the coming decades, we applied a busi- 
ness-as-usual (BaU) emission scenario for the years 2025 and 2050, 
assuming that only currently agreed legislation is implemented that 
will affect future emissions”. Thus air quality and emission standards 
are fixed. Results for 2050 are presented here, and for 2025 in Extended 
Data Fig. 2 and Extended Data Tables 4, 5. Under the BaU scenario, 
moderate though significant increases of premature mortality will 
occur in Europe and the Americas, to a large degree in urban areas. 
Large increases are projected in Southeast Asia and the Western 
Pacific, leading to a global growth of premature mortality to 6.6 
(CI95: 3.4-9.3) million (+100%) in 2050 (Table 1). This compares
to a negligible population increase of infants (<5 years old), and a
substantial increase (+68%) among people ≥30 years old in 2050
(implying an ageing population). Globally, the per capita mortality 
is projected to increase from 5 per 10,000 person-year in 2010 to about 
7 per 10,000 person-year in 2050. The mortality attributable to air 
pollution will continue to be dominated by Asia with an unchanged 
fraction of about 75%. 

The urban population is expected to grow relatively rapidly from 
3.6 billion in 2010 to 5.2 billion in 2050, and combined with increasing 
air pollution concentrations the health impacts will escalate. Our 
estimate of urban premature mortality by outdoor air pollution in 
2010 is 2.0 million, increasing to 4.3 million in 2050, representing 
60% of the global total in 2010 and 65% in 2050. Urban population 
growth is responsible for part of this change, but the levels of air 
pollution in urban areas are also projected to grow rapidly. This is 
evident from our finding that the per capita mortality attributable to 
air pollution in 2010 is about 50% higher in urban than in rural envir- 
onments. Under the BaU scenario this difference is expected to 
increase to nearly 90% in 2050. 
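
The urban figures quoted above can be cross-checked with simple arithmetic. The sketch below is only an illustration: it uses the rounded values from the text (urban deaths and urban population) and the world total from Table 2, so the results match the stated 60% and 65% shares only approximately. The function names are introduced here for illustration.

```python
# Rounded figures quoted in the text (deaths per year, people).
urban_deaths = {2010: 2.0e6, 2050: 4.3e6}
global_deaths = {2010: 3.297e6, 2050: 6.6e6}   # 2010 value from Table 2, World row
urban_population = {2010: 3.6e9, 2050: 5.2e9}


def urban_share(year):
    """Fraction of global premature mortality that occurs in urban areas."""
    return urban_deaths[year] / global_deaths[year]


def urban_rate_per_10k(year):
    """Urban premature deaths per 10,000 person-years."""
    return urban_deaths[year] / urban_population[year] * 1e4


for year in (2010, 2050):
    print(year, f"share={urban_share(year):.2f}",
          f"rate={urban_rate_per_10k(year):.1f} per 10,000")
```

Because the inputs are rounded, small deviations from the percentages in the text are expected.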

Recently, much emphasis has been placed on rapidly emerging 
megacities (Methods). We calculate that 17 megacities and conurba- 
tions in Asia rank among the top 30 in terms of premature mortality 
worldwide, the leading one being the Pearl River Delta. When viewed 
instead from the perspective of individual risk, Tianjin and Beijing 
rank highest (Extended Data Table 6). While the per capita mortality 
attributable to air pollution is already extraordinary in Chinese mega- 
cities, according to the BaU scenario it will become even higher in 
Chinese and also Indian megacities by 2050. The combined premature
mortality in the 30 largest conurbations accounts for about 7% of
the worldwide burden of air pollution, indicating the relevance of all 
urban areas. 

Our results suggest that if the projected increase in mortality attrib- 
utable to air pollution is to be avoided, intensive air quality control 
measures will be needed, particularly in South and East Asia. The 
poorly characterized uncertainty about the relative toxicity of various 
classes of particles such as sulfates, nitrates, organics, crustal materials, 
black carbon, and especially smoke from biomass combustion, limits 
unambiguous attribution of sources. Nevertheless, our study suggests 
that emissions from residential energy use should be considered in air 
pollution control strategies and, if all fine particles are equally toxic, the 
reduction of agricultural emissions would improve air quality. An 
improvement in the efficacy of air pollution controls requires a better 
understanding of the relative toxicity of particles from various emis- 
sions sources. 


Online Content Methods, along with any additional Extended Data display items 
and Source Data, are available in the online version of the paper; references unique 
to these sections appear only in the online paper. 


Received 10 May 2014; accepted 27 July 2015. 


1. Murray, C. & Lopez, A. D. The Global Burden of Disease: A Comprehensive
Assessment of Mortality and Disability from Diseases, Injuries, and Risk Factors in 1990
and Projected in 2020 (Harvard Univ. Press, 1996).

2. Ezzati, M. et al. Selected major risk factors and global and regional burden of
disease. Lancet 360, 1347-1360 (2002).

3. Ostro, B. Outdoor Air Pollution: Assessing the Environmental Burden of Disease at
National and Local Levels (World Health Organization Environmental Burden of
Disease Series No. 5, WHO, Geneva, 2004).

4. Cohen, A. J. et al. The global burden of disease due to outdoor air pollution.
J. Toxicol. Environ. Health A 68, 1301-1307 (2005).

5. Lim, S. S. et al. A comparative risk assessment of burden of disease and injury
attributable to 67 risk factors and risk factor clusters in 21 regions, 1990-2010: a
systematic analysis for the Global Burden of Disease Study 2010. Lancet 380,
2224-2260 (2012); correction 381, 628 (2013).

6. Pope, C. A. III & Dockery, D. W. Health effects of fine particulate air pollution: lines
that connect. J. Air Waste Manag. Assoc. 56, 709-742 (2006).

7. Beelen, R. et al. Effects of long-term exposure to air pollution on natural-cause
mortality: an analysis of 22 European cohorts within the multicentre ESCAPE
project. Lancet 383, 785-795 (2014).

8. Burnett, R. T. et al. An integrated risk function for estimating the Global Burden of
Disease attributable to ambient fine particulate matter exposure. Environ. Health
Perspect. 122, 397-403 (2014).

9. Jerrett, M. et al. Long-term ozone exposure and mortality. N. Engl. J. Med. 360,
1085-1095 (2009).

10. Tuomisto, J. T., Wilson, A., Evans, J. S. & Tainio, M. Uncertainty in mortality response
to airborne fine particulate matter: combining European air pollution experts.
Reliab. Eng. Syst. Saf. 93, 732-744 (2008).

11. Pope, C. A. III et al. Lung cancer, cardiopulmonary mortality, and long-term
exposure to fine particulate air pollution. J. Am. Med. Assoc. 287, 1132-1141
(2002).

12. Prüss-Ustün, A., Bonjour, S. & Corvalán, C. The impact of the environment on
health by country: a meta-synthesis. Environ. Health 7,
http://dx.doi.org/10.1186/1476-069X-7-7 (2008).

13. Russell, A. G. & Brunekreef, B. A focus on particulate matter and health. Environ. Sci.
Technol. 43, 4620-4625 (2009).

14. Gurjar, B. R. et al. Human health risks in megacities due to air pollution. Atmos.
Environ. 44, 4606-4613 (2010).

15. West, J. J., Fiore, A. M., Horowitz, L. W. & Mauzerall, D. L. Global health benefits of
mitigating ozone pollution with methane emission controls. Proc. Natl Acad. Sci.
USA 103, 3988-3993 (2006).

16. Duncan, B. N. et al. The influence of European pollution on ozone in the Near East
and northern Africa. Atmos. Chem. Phys. 8, 2267-2283 (2008).

17. Liu, J., Mauzerall, D. L. & Horowitz, L. W. Evaluating inter-continental transport of
fine aerosols: (2) Global health impact. Atmos. Environ. 43, 4339-4347 (2009).

18. Anenberg, S. C., Horowitz, L. W., Tong, D. Q. & West, J. J. An estimate of the global
burden of anthropogenic ozone and fine particulate matter on premature human
mortality using atmospheric modeling. Environ. Health Perspect. 118, 1189-1195
(2010).

19. Fann, N. et al. Estimating the national public health burden associated with
exposure to ambient PM2.5 and ozone. Risk Anal. 32, 81-95 (2012).

20. Silva, R. A. et al. Global premature mortality due to anthropogenic outdoor air
pollution and the contribution of past climate change. Environ. Res. Lett. 8,
http://dx.doi.org/10.1088/1748-9326/8/3/034005 (2013).

21. Lelieveld, J., Barlas, C., Giannadaki, D. & Pozzer, A. Model calculated global, regional
and megacity premature mortality due to air pollution by ozone and fine
particulate matter. Atmos. Chem. Phys. 13, 7023-7037 (2013).


22. Giannadaki, D., Pozzer, A. & Lelieveld, J. Modeled global effects of airborne desert
dust on air quality and premature mortality. Atmos. Chem. Phys. 14, 957-968
(2014).

23. van Donkelaar, A. et al. Global estimates of ambient fine particulate matter
concentrations from satellite-based aerosol optical depth: development and
application. Environ. Health Perspect. 118, 847-855 (2010).

24. Brauer, M. et al. Exposure assessment for estimation of the Global Burden of
Disease attributable to outdoor air pollution. Environ. Sci. Technol. 46, 652-660
(2012).

25. Thurston, G. D. et al. in National Particle Component Toxicity (NPACT) Initiative:
Integrated Epidemiologic and Toxicologic Studies of the Health Effects of Particulate
Matter Components (eds Lippmann, M. et al.) 127-166 (Health Effects Institute
Research Report 177, Boston, 2013).

26. Lippmann, M. et al. (eds) National Particle Component Toxicity (NPACT) Initiative:
Integrated Epidemiologic and Toxicologic Studies of the Health Effects of Particulate
Matter Components (Health Effects Institute Research Report 177, Boston, 2013).

27. Vedal, S. et al. National Particle Component Toxicity (NPACT) Initiative: Report on
Cardiovascular Effects (Health Effects Institute Research Report 178, Boston,
2013).

28. Yang, G. et al. Rapid health transition in China, 1990-2010: findings from the
Global Burden of Disease Study 2010. Lancet 381, 1987-2015 (2013).

29. Megaritis, A. G., Fountoukis, C., Charalampidis, P. E., Pilinis, C. & Pandis, S. N.
Response of fine particulate matter concentrations to changes of emissions and
temperature in Europe. Atmos. Chem. Phys. 13, 3423-3443 (2013).

30. Ginoux, P., Prospero, J. M., Gill, T. E., Hsu, N. C. & Zhao, M. Global-scale attribution of
anthropogenic and natural dust sources and their emission rates based on MODIS
Deep Blue aerosol products. Rev. Geophys. 50, RG3005 (2012).

31. Tager, I. et al. Traffic-related Air Pollution: A Critical Review of the Literature on
Emissions, Exposure, and Health Effects (Health Effects Institute Special Report 17,
Boston, 2010).

32. Pozzer, A. et al. Effects of business-as-usual anthropogenic emissions on air
quality. Atmos. Chem. Phys. 12, 6915-6937 (2012).

Acknowledgements We are grateful to the EDGAR team of the Joint Research Centre in 
Ispra, Italy, for the emission data. We acknowledge support from the Distinguished 
Scientist Fellowship Program at the King Saud University, Riyadh. The research leading 
to these results has received funding from the European Research Council under the 
European Union’s Seventh Framework Programme (FP7/2007-2013)/ERC grant 
agreement no. 226144. 


Author Contributions J.L., A.P. and M.F. planned the research, A.P. performed the 
model calculations, J.L., A.P., D.G. and J.S.E. analysed the results, and J.L. and J.S.E.
wrote the paper. All authors contributed to the manuscript. 


Author Information Reprints and permissions information is available at 
www.nature.com/reprints. The authors declare no competing financial interests. 
Readers are welcome to comment on the online version of the paper. Correspondence 
and requests for materials should be addressed to J.L. (jos.lelieveld@mpic.de). 




METHODS 

Model and emissions. We used the global ECHAM5/MESSy atmospheric chem- 
istry (EMAC) general circulation model at a spatial resolution of T106L31, that is, 
with a spherical spectral truncation of T106, which corresponds to a quadratic 
Gaussian grid of approximately 1.1° × 1.1° latitude × longitude (~110 km at the
Equator), with 31 vertical hybrid terrain-following and pressure levels up to 10 hPa
in the lower stratosphere. The core atmospheric model is the 5th generation
European Centre Hamburg (ECHAM5, version 5.3.01) general circulation model”.
EMAC includes sub-models that represent tropospheric and stratospheric pro- 
cesses and their interaction with oceans, land and human influences****. It uses 
the Modular Earth Submodel System (MESSy, v.1.09) to link submodels that 
describe emissions, atmospheric chemistry, aerosol and deposition processes; the 
results have been tested against in situ and remote sensing observations” ”’. 

Following up on Lelieveld et al.”, who focused on the year 2005, we present 
results for the years 2010, 2025 and 2050, applying monthly varying emission data 
from Doering et al.*°, also used by Pozzer et al.**. The data are from the Emission 
Database for Global Atmospheric Research (EDGAR), prepared by the Joint 
Research Centre of the European Commission in Ispra (Italy) at a resolution of 
0.1° latitude and longitude***’. For the year 2010 we performed sensitivity calcula- 
tions in which seven main emission categories have been removed one by one to 
compute the impact of these sources and to estimate their contributions to air 
quality control and related mortality. We first calculated the apportionment of 
source categories to the total PM2.5 and O3 concentrations and then applied the
computed fractions to the total mortalities attributable to air pollution. 

The categories are: (1) ‘Natural’ (NAT), mostly desert dust but locally also sea 
salt and dimethyl sulphide derived sulphate, some nitrate and ammonium from 
natural sources, volcanic sulphur emissions and organics released by the vegeta- 
tion; (2) ‘Industry’ (IND), including iron and steel, chemical, pulp and paper, 
food, solvent and other manufacturing sectors, oil refineries and fuel production; 
(3) ‘Land transport’ (TRA), that is, road and non-road transport on land; 
(4) ‘Residential and commercial energy use’ (RCO), referring to local and com- 
mercial energy use from small combustion sources for space heating and cooking, 
including diesel generators and biofuel use; (5) ‘Power generation’ (PG), that is, 
public energy production by fossil fuel fired power plants; (6) ‘Biomass burning’
(BB), that is, tropical forest fires and deforestation, savanna and shrub fires, middle 
and high latitude forest and grassland fires, and agricultural waste burning; and 
(7) ‘Agriculture’ (AGR), dominated by ammonia emissions associated with the use 
of fertilizers and domesticated animals. Not included in these categories are air 
traffic and shipping. We find that the removal of individual source categories leads 
to a near-linear response in the modelled contributions to mortality, indicated by 
the small scaling corrections needed (about 10%) to add up to 100% in the country 
level contributions, that is, in Table 2 and Extended Data Table 3. 
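
The one-by-one removal and rescaling described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `apportion` and all numbers are hypothetical, and the rescaling to 100% is an assumed reading of the "small scaling corrections" mentioned in the text.

```python
def apportion(baseline, without_source):
    """Estimate percentage contributions of source categories.

    baseline: modelled value (e.g. PM2.5 concentration) with all sources.
    without_source: mapping category -> value with that category removed.
    Returns (shares in %, scale), where scale is the factor that makes the
    individual reductions add up to the baseline. A scale close to 1
    indicates a near-linear response to removing single categories.
    """
    reductions = {k: baseline - v for k, v in without_source.items()}
    total = sum(reductions.values())
    if total <= 0:
        raise ValueError("no positive contribution to apportion")
    scale = baseline / total
    shares = {k: 100.0 * r / total for k, r in reductions.items()}
    return shares, scale


# Hypothetical numbers for one grid cell (arbitrary concentration units).
baseline = 40.0
without = {"RCO": 28.0, "AGR": 32.5, "NAT": 33.0, "PG": 34.5,
           "IND": 37.0, "BB": 38.0, "TRA": 38.0}
shares, scale = apportion(baseline, without)

# The shares can then be applied to a total mortality estimate.
total_deaths = 1000.0  # hypothetical
deaths_by_source = {k: total_deaths * s / 100.0 for k, s in shares.items()}
```

In this made-up example the scale factor is about 1.03, that is, a correction of a few per cent, comparable in spirit to the roughly 10% corrections reported above.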

The BaU scenarios for 2025 and 2050 assume that energy and food consump- 
tion are largely determined by population growth and economic development, 
which in turn drive air pollution sources based on current legislation and 
technology***’*'. This represents a pessimistic, but plausible future prospect. 
Comparable to Shindell et al’, and different from the Representative 
Concentration Pathways of the Intergovernmental Panel on Climate Change”, 
the BaU scenario differentiates between air pollution and climate change mitiga- 
tion measures, as the latter typically require relatively long-term and structural 
societal changes. The scenarios used here are based on projections for energy and 
fuel computed by the Prospective Outlook for the Long-term Energy System 
(POLES) model*'™ and for agriculture, land-use and waste projections by the 
Integrated Model to Assess the Global Environment (IMAGE)”. 

The population development in the BaU scenario is consistent with our mor- 
tality calculations, as described below, projecting 9 billion people in 2050. For 
additional details we refer to Pozzer et al.” and references therein. While BaU 
projections should not be conceived as ‘predictions’, especially for 2050, they 
represent the current trajectory into the future and may be considered a worst- 
case scenario, to explore what can be expected if air quality policies and health care 
remain as they are today. Note that these results are not sensitive to differential 
toxicity assumptions as the total mortality induced by PM2.5 is not affected, only
the attribution to source categories. For the future scenarios we used the baseline 
mortalities for 2010. Hence the implicit assumption is that smoking habits, diets 
and health care remain unchanged. 

The model meteorology has been forced by pre-calculated sea surface tempera- 
tures and ice coverage based on a 10-year climatology (2000-2009) adopted from 
the AMIP-II database**”’. The model was applied in atmospheric chemistry-trans- 
port mode by switching the coupling between radiation and atmospheric chem- 
istry off, so that atmospheric composition changes do not influence the model 
dynamics”. This is justified considering that air quality projections are primarily 
driven by emissions rather than climate change**’, even though natural 
sources, biomass burning and deposition processes can be influenced by climatic
conditions”. For example, Fang et al.” project a 4% climate change effect for
PM2.5 related mortality and less than 1% for O3 related mortality by the end of the
21st century.

Although our model resolution does not resolve small-scale heterogeneities in the 
urban environment, a comparison with satellite and ground-based remote sensing 
observations indicates that this is not critical. The exposure response functions used 
to calculate mortalities are based on annual mean concentrations for which these 
heterogeneities largely average out. This is illustrated by Extended Data Fig. 3, which 
compares a simulation for the year 2010 with ground-based AERONET remote 
sensing data of aerosol optical depth (AOD) (http://aeronet.gsfc.nasa.gov). Since 
our model approximates but does not replicate meteorological conditions for the
year 2010, and local flows near the AERONET stations cannot be captured, substan- 
tial scatter around the ideal 1:1 comparison is expected. The comparison shows that 
the model mean error and bias are small (the latter absent for the annual mean), 
and the correlation good. We have also performed a comparison between MODIS 
(satellite) and AERONET data of AOD, leading to similar spread and correlations, 
the latter also increasing through averaging (not shown). 

The primary differences in the relationships between emissions and exposures 
for ground level sources, such as traffic, in comparison with elevated sources, such 
as power plants, have been accounted for in our model"’. The relative impacts of 
secondary particles (such as sulfates and nitrates) from these sources are expected 
to be realistically simulated. On the other hand, models such as ours cannot 
capture the fine structure of near-source gradients in ultrafine PM along trans- 
portation corridors. Because of this our estimates of the relative impacts of urban 
traffic and urban sources of primary fine particles may be biased downward, 
though only to the extent that ultrafine PM is in fact responsible for the mortality 
seen in cohort studies. As discussed above, the relative toxicity of various constituents
of ambient PM2.5 has not been well established. Our sense is that the
sensitivity study, allowing for carbonaceous particles to be five times as toxic as
sulfates, nitrates and crustal material, is adequate to cover any potential differences
in the relationships between emissions, exposure and differential toxicity of traffic
related PM2.5.
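
One possible way to picture the differential toxicity sensitivity study is to reweight each source's PM2.5 contribution before computing shares. The sketch below is an assumption about how such a weighting could be set up, not the procedure used in the paper; the function name, the split into carbonaceous and other mass, and the example values are all hypothetical.

```python
def weighted_shares(contributions, carbon_weight=5.0):
    """Toxicity-weighted source shares in per cent.

    contributions: mapping category -> (carbonaceous, other) PM2.5 mass
    contributed by that category. Carbonaceous mass is multiplied by
    carbon_weight; other mass (sulfate, nitrate, crustal) has weight 1.
    Only the attribution changes; the shares always sum to 100%.
    """
    weighted = {k: carbon_weight * c + o for k, (c, o) in contributions.items()}
    total = sum(weighted.values())
    if total <= 0:
        raise ValueError("total weighted contribution must be positive")
    return {k: 100.0 * w / total for k, w in weighted.items()}


# Hypothetical two-source example (arbitrary mass units).
example = {"RCO": (3.0, 1.0), "AGR": (0.0, 4.0)}
equal_toxicity = weighted_shares(example, carbon_weight=1.0)
differential = weighted_shares(example, carbon_weight=5.0)
```

With equal toxicity the two hypothetical sources split evenly, whereas with the fivefold weighting the carbon-rich source dominates, mirroring the direction of the shifts shown in parentheses in Table 2.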

To investigate if our model reproduces urban concentration increments of PM2.5
and O3, that is, comparing the urban background with the rural environment, we
compare our results with recent case studies®*’. For Paris and London our model
computes urban PM2.5 increments of 18% and 2%, respectively, consistent with the
measurements and highly resolved model calculations. Our model calculations suggest
that the leading sources of PM2.5 in Paris are residential energy use, agriculture
and traffic. Agricultural emissions (NH3/NH4+) are transported from the rural
environment and contribute to PM2.5 in the city. For London we calculate that
PM2.5 is most strongly influenced by agriculture, traffic and power generation. The
limited contribution by land traffic and the importance of atmospheric transport for
air quality in London have been corroborated by observational analysis®. For Beijing
we calculate an urban PM2.5 increment of 5%, consistent with the conclusion by
Zhang et al.’ that regional sources are crucial contributors to PM2.5. They estimate
the contribution by traffic and waste incineration at 4%; our results suggest that traffic 
alone contributes 3% in this city and residential energy use 47%, which we find to be 
representative of China (Table 2). 

Our model calculations indicate that these relatively small urban increments for 
PM2.5 are typical for many, though not all, cities. For example, for Johannesburg
(including Pretoria) we find +41% and for the Pearl River area +62%, and in both
conurbations residential energy use is the leading source of PM2.5. For O3 we find
generally small and negative urban increments due to titration of O3 by local traffic 
emissions (in Paris —7% and in London —5%). Negative urban increments due to 
NO by traffic of a few per cent (comparing weekend with weekdays) have also been 
documented for American cities**. For Chicago, New York, Los Angeles and Atlanta 
we find negative O3 increments of 1-5% due to traffic and power generation. 
Sample size. No statistical methods were used to predetermine sample size. 
Exposure response functions. The premature mortality attributable to PM2.5 and
O3 has been calculated by applying the EMAC model for the present (2010) and
projected future (2025, 2050) concentrations. We combined the results with epi- 
demiological exposure response functions by employing the following relationship 
to estimate the excess (that is, premature) mortality: 


$$\Delta\mathrm{Mort} = y_0 \left[ (RR - 1)/RR \right] \mathrm{Pop} \qquad (1)$$


ΔMort is a function of the baseline mortality rate due to a particular disease
category, y0, for countries and/or regions estimated by the World Health
Organization (the regions and strata are listed in the Extended Data Table 1). 
The term (RR — 1)/RR is the attributable fraction and RR is the relative risk. The 
disease specific baseline mortality rates have been obtained from the WHO Health 
Statistics and Health Information System. The value of RR is calculated for the 
different disease categories attributed to PM2.5 and O3 for the population below
5 years of age (ALRI) and 30 years and older (IHD, CEV, COPD, LC) using
exposure response functions from the 2010 GBD analysis of the WHO (and
described below). 
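
Equation (1) can be illustrated with a short sketch. The function names and the input values below are hypothetical and chosen only to make the arithmetic easy to follow; they are not values from the study.

```python
def attributable_fraction(rr):
    """Attributable fraction (RR - 1) / RR for a relative risk RR >= 1."""
    if rr < 1.0:
        raise ValueError("relative risk below 1 is not handled in this sketch")
    return (rr - 1.0) / rr


def excess_mortality(baseline_rate, rr, population):
    """Excess deaths per year following the form of equation (1).

    baseline_rate: baseline mortality rate for one disease category
                   (deaths per person per year).
    rr:            relative risk at the local pollutant concentration.
    population:    exposed population in the relevant age group.
    """
    return baseline_rate * attributable_fraction(rr) * population


# Hypothetical inputs: 2 deaths per 1,000 people per year, RR = 1.25,
# one million exposed people.
deaths = excess_mortality(0.002, 1.25, 1_000_000)
```

In practice the calculation would be repeated per grid cell, disease category and age group, and then summed.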

The population (Pop) data for regions, countries and urban areas have been 
obtained from the NASA Socioeconomic Data and Applications Center (SEDAC), 
hosted by the Columbia University Center for International Earth Science 
Information Network (CIESIN), available at a resolution of 2.5′ × 2.5′ (about
5 km × 5 km) (http://sedac.ciesin.columbia.edu/), and projections by the United
Nations Department of Economic and Social Affairs/Population Division”
(http://esa.un.org/unpd/wpp). Urban areas are defined by applying a population
density threshold of 400 individuals per km², while for megacities and major
conurbations the threshold is 2,000 individuals per km². We note that the resolution
of our atmospheric model, about 1° latitude/longitude, is coarser than that
of the population data, and our model does not resolve details of the urban 
environment. However, our anthropogenic emission data are aggregated from a 
resolution of 10 km to that of the model grid, accounting for relevant details such 
as altitude dependence (for example, stack emissions and hot plume rise effects)’. 

Lelieveld et al. (henceforth L2013) derived the relative risk RR from the
following exposure response function:


$$\mathrm{RR} = \exp[b\,(X - X_0)] \qquad (2)$$


The term X represents the model-calculated annual mean concentration of PM2.5 or
O3. The value of $X_0$ is the threshold concentration below which no additional risk is
assumed (concentration-response threshold). The parameter b is the
concentration-response coefficient. However, it has been argued that this expression is based on
epidemiological cohort studies in the USA and Europe where annual mean PM2.5
concentrations are typically below 30 µg m⁻³, which may not be representative for
countries where air pollution levels can be much higher, for example in South and
East Asia. This is particularly relevant for our BaU scenario. Therefore, here we have
used the revised exposure response function of Burnett et al., who also included
epidemiological data from the exposure to second-hand smoke, indoor air pollution
and active smoking to account for high PM2.5 concentrations, and tested eight
different expressions. The best fit to the data was found for the following relationship,
which was also used by Lim et al. for the GBD for the year 2010:


$$\mathrm{RR} = 1 + a\,\{1 - \exp[-b\,(X - X_0)^{p}]\} \qquad (3)$$


The RR functions were derived by Burnett et al. We applied this model, which was
shown to be superior to other forms previously used in burden assessments, to the
different disease categories represented in their figures 1 and 2. We also adopted the
upper and lower bounds, likewise shown in these figures, representing the 95%
confidence intervals (CI95). The latter were derived from Monte Carlo simulations,
leading to 1,000 sets of coefficients and exposure response functions from which the
upper and lower bounds were calculated.
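
The two functional forms, and the way bounds can be read off a set of sampled coefficients, can be sketched in Python as follows. This is only an illustration under stated assumptions: the exponent of equation (3) is written `p` here, the coefficient ranges in `sample_coefficients` are invented placeholders rather than the fitted values of Burnett et al., and taking the 2.5th and 97.5th percentiles across the sets is one plausible way to form the bounds, not necessarily the procedure used in that work.

```python
import math
import random


def rr_log_linear(x, b, x0):
    """Equation (2): RR = exp[b (X - X0)], with RR = 1 below the threshold X0."""
    return math.exp(b * max(x - x0, 0.0))


def rr_saturating(x, a, b, p, x0):
    """Equation (3): RR = 1 + a {1 - exp[-b (X - X0)^p]}, with RR = 1 below X0."""
    excess = max(x - x0, 0.0)
    return 1.0 + a * (1.0 - math.exp(-b * excess ** p))


def sample_coefficients(n_sets, seed=0):
    """Draw hypothetical (a, b, p, x0) sets; every range below is a placeholder."""
    rng = random.Random(seed)
    return [
        (
            rng.uniform(0.8, 1.2),    # a: hypothetical range
            rng.uniform(0.01, 0.03),  # b: hypothetical range
            rng.uniform(0.8, 1.2),    # p: hypothetical range
            rng.uniform(5.8, 8.8),    # x0 in ug/m3: hypothetical range
        )
        for _ in range(n_sets)
    ]


def central_and_bounds(x, coefficient_sets):
    """Approximate median and 2.5th/97.5th percentile RR across the sets."""
    values = sorted(rr_saturating(x, *c) for c in coefficient_sets)
    last = len(values) - 1
    return values[last // 2], values[int(0.025 * last)], values[int(0.975 * last)]


coefficient_sets = sample_coefficients(1000)
median_rr, lower_rr, upper_rr = central_and_bounds(50.0, coefficient_sets)
print(f"RR at 50 ug/m3: {median_rr:.2f} ({lower_rr:.2f} to {upper_rr:.2f})")
```

Unlike the exponential form of equation (2), the form of equation (3) cannot exceed 1 + a, so it levels off at the very high concentrations for which the additional exposure data were included.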

Following Burnett et al. and Lim et al. we combine all aerosol types, hence
including natural particulates such as desert dust. Note that by using PM2.5 mass,
we do not distinguish the possibly different toxicity of various kinds of particles. This
information is not available from epidemiological cohort studies, but could
potentially substantially affect both our overall estimates of mortality and the geographical
patterns. This is addressed by sensitivity calculations presented in the main text,
Table 2 and Extended Data Fig. 1. For COPD related to O3 we applied the exposure
response function by Ostro et al.:


$$\mathrm{RR} = [(X + 1)/(X_0 + 1)]^{b} \qquad (4)$$


where b is 0.1521 and $X_0$ is the average of the range 33.3-41.9 p.p.b.v. O3 indicated by
Lim et al., that is, 37.6 p.p.b.v. Previously we used model-calculated pre-industrial O3
concentrations to estimate X − $X_0$ (ref. 21), leading to about 20% higher estimates for
mortality by ‘respiratory disease’ related solely to O3 compared to the current
estimate for COPD due to both PM2.5 and O3.
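
Equation (4) with the stated coefficient and threshold can be evaluated directly. The sketch below is a minimal illustration: the function name and the example concentration of 50 p.p.b.v. are hypothetical, and returning RR = 1 at or below the threshold is an assumption that follows the definition of the concentration-response threshold given earlier.

```python
B_OZONE = 0.1521                # coefficient b of equation (4)
X0_RANGE_PPBV = (33.3, 41.9)    # threshold range indicated by Lim et al.
X0_PPBV = sum(X0_RANGE_PPBV) / 2.0  # midpoint of the range, 37.6 p.p.b.v.


def rr_ozone_copd(x_ppbv: float) -> float:
    """Relative risk of COPD mortality for an ozone concentration in p.p.b.v."""
    if x_ppbv <= X0_PPBV:
        return 1.0  # assumption: no additional risk at or below the threshold
    return ((x_ppbv + 1.0) / (X0_PPBV + 1.0)) ** B_OZONE


# Hypothetical example concentration, not a value from the study.
print(f"threshold: {X0_PPBV:.1f} p.p.b.v.")
print(f"RR at 50 p.p.b.v.: {rr_ozone_copd(50.0):.3f}")
```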

For detailed discussion of uncertainties and sensitivity calculations that address
the shape of exposure response functions, we refer to earlier work and
references therein. L2013 estimated statistical uncertainties by propagating the
quantified (random) errors of all parameters in the exposure response functions.
They found that the CI95 of estimated mortality attributable to air pollution in
Europe, North and South America, South and East Asia are within 40%, whereas
they are 100-170% in Africa and the Middle East. Our results are very close to the
GBD, which substantiates the estimates by Lim et al. and provides consistency
with the most recent estimates for 2010, serving as a basis for our investigations.

We emphasize that the confidence intervals described here, and those reported
by Lim et al., reflect only the statistical uncertainty of the parameters used in the
concentration-response functions. It is known that the uncertainty in
interpretation of epidemiological results can be dominated by other model or epistemic
uncertainties, such as those having to do with the control of confounders. Sources
of uncertainty have been summarized by Kinney et al. (ref. 71), who underscore the need
to determine the differential toxicity of specific component species within the
complex mixture of particulate matter. Our sensitivity calculations (Table 2 and
Extended Data Fig. 1) corroborate that this can have significant influence,
especially in areas where carbonaceous compounds contribute strongly to PM2.5.

We emphasize the dearth of studies that link PM2.5 from biomass combustion
emissions, which are rich in carbonaceous particles, to IHD. Expert judgment studies on
the toxicity of particulate matter have reported uncertainties much larger than
those suggested by analysis of parameter uncertainty alone. Although the CI95
intervals provided above include a larger range of parameters and uncertainties
than these earlier studies, they should be viewed as lower bounds on the true
uncertainty in estimates of the health effects of PM2.5 exposure, especially PM2.5
from biomass burning and biofuel use. If we consider the possibility that biomass
burning (BB, including agricultural waste burning) and residential energy use
(RCO, dominated by biofuel use) do not contribute to mortality by IHD, the total
mortality attributable to air pollution would decrease from 3.3 to 3.0 million per
year (Extended Data Table 7). The largest effect is found in Southeast Asia where
biomass combustion (RCO and BB) is a main source of air pollution. While the
global contribution by residential energy use, as presented in Table 2, would
decrease from 31% to 26%, and of biomass burning from 5% to 4% (the other
categories increase proportionally), the ranking of the different sources and hence
our conclusions remain unchanged, as RCO and BB would still be the largest and
smallest source category, respectively.
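
The size of this sensitivity case can be checked against the regional figures in Extended Data Table 7. The Python sketch below copies the IHD and total columns of that table (thousands of deaths in 2010, with and without IHD mortality from RCO and BB) and computes the reduction per region; the variable and function names are hypothetical and the code is only a bookkeeping aid.

```python
# (IHD, IHD excluding RCO and BB, total, total excluding RCO and BB),
# in thousands of deaths for 2010, as listed in Extended Data Table 7.
TABLE7_2010 = {
    "Africa": (55, 37, 237, 219),
    "Americas": (44, 35, 68, 59),
    "Eastern Mediterranean": (115, 104, 286, 275),
    "Europe": (239, 213, 381, 355),
    "Southeast Asia": (327, 169, 862, 704),
    "Western Pacific": (299, 250, 1463, 1414),
}
WORLD_2010 = (1079, 808, 3297, 3026)


def reduction(row):
    """Absolute (thousands) and relative (%) drop in total mortality."""
    _ihd, _ihd_excluded, total, total_excluded = row
    drop = total - total_excluded
    return drop, 100.0 * drop / total


regional = {region: reduction(row) for region, row in TABLE7_2010.items()}
largest_region = max(regional, key=lambda region: regional[region][0])
world_drop, world_percent = reduction(WORLD_2010)

print(f"largest regional drop: {largest_region}")
print(f"world: {world_drop} thousand fewer deaths ({world_percent:.1f}%)")
```

The regional reductions add up to the world reduction of 271 thousand deaths, which is the step from 3.3 to 3.0 million per year quoted in the text.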

Issues such as the shape of the concentration-response functions and the
existence and specific levels of concentration-response thresholds have been discussed
by the experts. These have been accounted for by Burnett et al.; however,
uncertainty related to the differences in central estimates given by various
cohort studies is not reflected in the estimates of parameter uncertainty by Lim
et al. This problem has grown more substantial recently as the results from
new cohort studies have become available. Furthermore, uncertainty about
the relative toxicity of different constituents of PM2.5 remains. Since the current
study underscores that the sources of mortality attributable to PM2.5 can differ
strongly between different regions (Fig. 2), this aspect merits greater attention
in future.

Comparison to previous work. We estimate the combined (PM2.5- and O3-related)
global mortality attributable to air pollution in 2010 at 3.3 million. Our global
estimate for PM2.5-related mortality of 3.15 million per year is close to that of
3.22 million per year in the GBD study for 2010 (ref. 4). However, it is substantially
higher than the 2.1 million per year estimated for the year 2000 in the recent
multi-model study of Silva et al. The difference can be explained by the focus of Silva et al. on
anthropogenic pollution in 2000, whereas our study and the GBD account for
emission increases between 2000 and 2010 and also include natural sources.

Our global estimate of O3-related mortality by COPD in 2010 is 142,000,
substantially lower than the estimates of Anenberg et al. (700,000 deaths in
2000), L2013 (773,000 in 2005) and Silva et al. (470,000 deaths in 2000), but quite
close to the GBD estimate of 152,000 deaths in 2010. Much of the difference
between our results (and those from the 2010 GBD) and previous work is
explained by the fact that we attribute COPD to both O3 and PM2.5. When our
results for COPD from both O3 and PM2.5 are combined, our overall estimate of
COPD mortality from air pollution agrees with the above-mentioned studies
within about 25-30%. The remaining differences are largely due to the use of a
concentration-response threshold, $X_0$, in our new work, which substantially
reduces mortality estimates. Anenberg et al. and L2013 did not apply a threshold
but computed the natural background based on preindustrial emissions. In these
analyses the calculated ambient concentrations are typically lower than $X_0$. For
example, the global average O3 ambient concentration at the surface in our
pre-industrial simulation is 19 p.p.b.v. The global mortality estimate for 2010
presented here is 10% higher than that of L2013 for 2005. This is primarily due to
the fact that we also account for natural sources in the present work. If we subtract
the natural fraction, our estimate of mortality attributable to anthropogenic air
pollution for 2010 is 9% lower than that of L2013, mostly related to the new
exposure response functions applied here.

Our calculations suggest that natural sources contribute relatively strongly to
mortality attributable to air pollution (18%), about 600,000 per year, which is to a
large degree caused by airborne desert dust. Recently we reported a global
dust-related mortality rate of about 400,000 per year, substantially lower than the
present estimate. While here we follow the GBD methodology, it is likely to yield an
upper limit. Instead of the annual mean dust concentrations, Giannadaki et al.
used the median concentrations, motivated by the intermittent nature of dust
events. Their sensitivity calculations indicate that had they used the mean
concentration instead, their estimate of global dust-related mortality would have
increased from 402,000 per year to 622,000 per year. Finally, if we assume that
carbonaceous aerosols are five times more toxic than other compounds, including
dust particles, the contribution by natural sources would decrease from about
600,000 per year (18%) to 360,000 per year (11%).
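
The percentages quoted for natural sources can be reproduced from the world totals in Extended Data Table 3. The short Python check below uses those totals together with the figures quoted in the paragraph above; for the sensitivity case it assumes that the global total stays at its 2010 value, which is a simplification made only for this illustration, and the names are hypothetical.

```python
TOTAL_DEATHS_2010 = 3_297_370    # world total, Extended Data Table 3
NATURAL_DEATHS_2010 = 596_895    # world total for natural sources, same table


def share(part: float, whole: float) -> float:
    """Percentage of `whole` accounted for by `part`."""
    return 100.0 * part / whole


natural_share = share(NATURAL_DEATHS_2010, TOTAL_DEATHS_2010)

# Sensitivity case with carbonaceous aerosol five times more toxic:
# about 360,000 deaths per year, global total assumed unchanged (assumption).
natural_share_sensitivity = share(360_000, TOTAL_DEATHS_2010)

# Mean-based versus median-based dust estimate of Giannadaki et al.
dust_mean_to_median = 622_000 / 402_000

print(f"natural share: {natural_share:.1f}%")
print(f"natural share, sensitivity case: {natural_share_sensitivity:.1f}%")
print(f"mean-based / median-based dust estimate: {dust_mean_to_median:.2f}")
```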




33. Roeckner, E. et al. Sensitivity of simulated climate to horizontal and vertical resolution in the ECHAM5 atmosphere model. J. Clim. 19, 3771-3791 (2006).
34. Jöckel, P. et al. Technical Note: The Modular Earth Submodel System (MESSy) - a new approach towards earth system modeling. Atmos. Chem. Phys. 5, 433-444 (2005).
35. Jöckel, P. et al. The atmospheric chemistry general circulation model ECHAM5/MESSy: Consistent simulation of ozone from the surface to the mesosphere. Atmos. Chem. Phys. 6, 5067-5104 (2006).
36. Pozzer, A., Jöckel, P., Kern, B. & Haak, H. The atmosphere-ocean general circulation model EMAC-MPIOM. Geosci. Model Dev. 4, 771-784 (2011).
37. Sander, R., Kerkweg, A., Jöckel, P. & Lelieveld, J. Technical note: The new comprehensive atmospheric chemistry module MECCA. Atmos. Chem. Phys. 5, 445-450 (2005).
38. Kerkweg, A. et al. Technical Note: An implementation of the dry removal processes DRY DEPosition and SEDImentation in the Modular Earth Submodel System (MESSy). Atmos. Chem. Phys. 6, 4617-4632 (2006).
39. Tost, H. et al. Technical note: A new comprehensive SCAVenging submodel for global atmospheric chemistry modeling. Atmos. Chem. Phys. 6, 565-574 (2006).
40. Tost, H. et al. Global cloud and precipitation chemistry and wet deposition: tropospheric model simulations with ECHAM5/MESSy1. Atmos. Chem. Phys. 7, 2733-2757 (2007).
41. Pozzer, A. et al. Technical Note: The MESSy-submodel AIRSEA calculating the air-sea exchange of chemical species. Atmos. Chem. Phys. 6, 5435-5444 (2006).
42. Pozzer, A. et al. Simulating organic species with the global atmospheric chemistry general circulation model ECHAM5/MESSy1: a comparison of model results with observations. Atmos. Chem. Phys. 7, 2527-2550 (2007).
43. Pozzer, A., Jöckel, P. & van Aardenne, J. The influence of the vertical distribution of emissions on tropospheric chemistry. Atmos. Chem. Phys. 9, 9417-9432 (2009).
44. Pozzer, A. et al. Distributions and regional budgets of aerosols and their precursors simulated with the EMAC chemistry-climate model. Atmos. Chem. Phys. 12, 961-987 (2012).
45. Astitha, M. et al. Parameterization of dust emissions in the global atmospheric chemistry-climate model EMAC: impact of nudging and soil properties. Atmos. Chem. Phys. 12, 11057-11083 (2012).
46. Pringle, K. J. et al. Description and evaluation of GMXe: A new aerosol submodel for global simulations (v1). Geosci. Model Dev. 3, 391-412 (2010).
47. Pringle, K. J. et al. Global distribution of the effective aerosol hygroscopicity parameter for CCN activation. Atmos. Chem. Phys. 10, 5241-5255 (2010).
48. de Meij, A. et al. EMAC model evaluation and analysis of atmospheric aerosol properties and distribution. Atmos. Res. 114-115, 38-69 (2012).
49. Christoudias, T. & Lelieveld, J. Modelling the global atmospheric transport and deposition of radionuclides from the Fukushima Dai-ichi nuclear accident. Atmos. Chem. Phys. 13, 1425-1438 (2013).
50. Doering, U., Janssens-Maenhout, G., van Aardenne, J. & Pagliari, V. Climate Change and Impact Research in the Mediterranean Environment: Scenarios of Future Climate Change. JRC Tech. Note 62957 (Joint Research Centre, Ispra, 2010).
51. Van Aardenne, J. et al. Climate and Air Quality Impacts of Combined Climate Change and Air Pollution Policy Scenarios. JRC Sci. Tech. Rep. 61281 http://dx.doi.org/10.2788/33719 (Joint Research Centre, Ispra, 2010).
52. Shindell, D. et al. Simultaneously mitigating near-term climate change and improving human health and food security. Science 335, 183-189 (2012).
53. Stocker, T. F. et al. (eds) Climate Change 2013: The Physical Science Basis (Cambridge Univ. Press, 2013).
54. Russ, P., Wiesenthal, T., van Regenmorter, D. & Ciscar, J. C. Global Climate Policy Scenarios for 2030 and Beyond. Analysis of Greenhouse Gas Emission Reduction Pathway Scenarios with the POLES and GEM-E3 models. JRC Ref. Rep. EUR 23032 EN, http://ipts.jrc.ec.europa.eu/publications/pub.cfm?id=1510 (Joint Research Centre, Ispra, 2007).
55. Bouwman, A. F., Kram, T. & Klein Goldewijk, K. (eds) Integrated Modelling of Global Environmental Change. An Overview of IMAGE 2.4 (Netherlands Environmental Assessment Agency (MNP), Bilthoven, 2006).
56. Taylor, K., Williamson, D. & Zwiers, F. The Sea Surface Temperature and Sea Ice Concentration Boundary Conditions for AMIP II Simulations. PCMDI Tech. Rep. 60 (Program for Climate Model Diagnosis and Intercomparison, Lawrence Livermore National Laboratory, Livermore, California, 2000).
57. Hurrell, J. et al. A new sea surface temperature and sea ice boundary dataset for the Community Atmosphere Model. J. Clim. 21, 5145-5153 (2008).
58. Jacob, D. J. & Winner, D. A. Effect of climate change on air quality. Atmos. Environ. 43, 51-63 (2009).
59. Pye, H. O. T. et al. Effect of changes in climate and emissions on future sulfate-nitrate-ammonium aerosol levels in the United States. J. Geophys. Res. 114, D01205, http://dx.doi.org/10.1029/2008JD010701 (2009).
60. Hedegaard, G. B., Christensen, J. H. & Brandt, J. The relative importance of impacts from climate change vs. emissions change on air pollution levels in the 21st century. Atmos. Chem. Phys. 13, 3569-3585 (2013).
61. Naik, V. et al. Preindustrial to present-day changes in tropospheric hydroxyl radical and methane lifetime from the Atmospheric Chemistry and Climate Model Intercomparison Project (ACCMIP). Atmos. Chem. Phys. 13, 5277-5298 (2013).
62. Fang, Y. et al. Impacts of 21st century climate change on global air pollution-related premature mortality. Clim. Change 121, 239-253 (2013).
63. Jones, A. M., Yin, J. & Harrison, R. M. The weekday-weekend difference and the estimation of the non-vehicle contributions to the urban increment of airborne particulate matter. Atmos. Environ. 42, 4467-4479 (2008).
64. Harrison, R. M., Laxen, D., Moorcroft, S. & Laxen, K. Processes affecting concentrations of fine particulate matter (PM2.5) in the UK atmosphere. Atmos. Environ. 46, 115-124 (2012).
65. Moussiopoulos, N. et al. An approach for determining urban concentration increments. Int. J. Environ. Pollut. 50, 376-385 (2012).
66. Timmermans, R. M. A. et al. Quantification of the urban air pollution increment and its dependency on the use of down-scaled and bottom-up city emission inventories. Urban Clim. 6, 44-62 (2013).
67. Zhang, R. et al. Chemical characterization and source apportionment of PM2.5 in Beijing: seasonal perspective. Atmos. Chem. Phys. 13, 7053-7074 (2013); Atmos. Chem. Phys. 14, 175 (2014).
68. Blanchard, C. L., Tanenbaum, S. & Lawson, D. R. Differences between weekday and weekend air pollutant levels in Atlanta; Baltimore; Chicago; Dallas-Fort Worth; Denver; Houston; New York; Phoenix; Washington, DC; and surrounding areas. J. Air Waste Manag. Assoc. 58, 1598-1615 (2008).
69. World Health Organization. World Health Organization Statistical Information System (WHOSIS), Detailed Data Files of the WHO Mortality Database http://www.who.int/healthinfo/statistics/mortality_rawdata/en/ (WHO, Geneva, 2012).
70. United Nations Department of Economic and Social Affairs/Population Division. World Population Prospects: the 2004 Revision. E.05.XIII.12 (United Nations, 2005).
71. Kinney, P. L. et al. On the use of expert judgment to characterize uncertainties in the health benefits of regulatory controls of particulate matter. Environ. Sci. Policy 13, 434-443 (2010).
72. Roman, H. A. et al. Expert judgment assessment of the mortality impact of changes in ambient fine particulate matter in the U.S. Environ. Sci. Technol. 42, 2268-2274 (2008).
73. Cao, J. et al. Association between long-term exposure to outdoor air pollution and mortality in China: A cohort study. J. Hazard. Mater. 186, 1594-1600 (2011).




[Map: latitude versus longitude; colour key: IND, TRA, RCO, BB, PG, AGR, NAT]

Extended Data Figure 1 | Source categories responsible for the largest impact on mortality linked to outdoor air pollution in 2010 from a sensitivity calculation with carbonaceous aerosol having a five times larger impact than inorganic and crustal compounds. IND, industry; TRA, land traffic; RCO, residential energy use (for example, heating, cooking); BB, biomass burning; PG, power generation; AGR, agriculture; and NAT, natural.




[Map: latitude versus longitude]

Extended Data Figure 2 | Increase in mortality linked to outdoor air pollution from 2010 to 2050 (business-as-usual scenario). Units (colour coded), deaths per area of 100 km × 100 km. In the white areas, no additional mortality is projected.




[Figure: three log-log scatter plots of EMAC AOD (vertical axis) against AERONET AOD (horizontal axis). Statistics printed in the panels, where legible: left, CORR 0.53, MBE -0.02, y-STD 0.45, x-STD 0.42; middle, RMSE 0.33, CORR 0.53, MBE 0.01, y-STD 0.31, x-STD 0.36; right, RMSE 0.26, CORR 0.63, MBE -0.00, y-STD 0.26, x-STD 0.33]

Extended Data Figure 3 | Comparison of EMAC model calculated aerosol
optical depth (AOD) with AERONET observations, using all available
measurements worldwide in the year 2010. Although the comparison with
individual data points shows a large scatter (left panel), the bias is
small (MBE), and time averaging improves the agreement. The middle panel
shows a comparison of the monthly means, and the right panel the annual
means (that is, showing individual stations) for which the mean error (root
mean square error, RMSE) is smallest, the correlation highest and the bias
absent. The long-dashed line indicates absolute agreement, the bold short-dashed
lines agreement within a factor of two and the short-dashed lines
agreement within a factor of ten.
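
Scores of the kind named in this caption (MBE, RMSE, correlation, agreement within a factor of two) can be computed along the following lines. The Python sketch assumes that the statistics are evaluated on the base-10 logarithm of AOD, which is suggested by the logarithmic axes but not stated in the caption; the function name and the sample values are hypothetical.

```python
import math


def log_space_scores(model, observed):
    """MBE, RMSE and Pearson correlation of log10 values, plus the fraction
    of model values that lie within a factor of two of the observations."""
    log_model = [math.log10(v) for v in model]
    log_obs = [math.log10(v) for v in observed]
    n = len(log_model)
    diffs = [m - o for m, o in zip(log_model, log_obs)]

    mbe = sum(diffs) / n
    rmse = math.sqrt(sum(d * d for d in diffs) / n)

    mean_m = sum(log_model) / n
    mean_o = sum(log_obs) / n
    cov = sum((m - mean_m) * (o - mean_o) for m, o in zip(log_model, log_obs))
    var_m = sum((m - mean_m) ** 2 for m in log_model)
    var_o = sum((o - mean_o) ** 2 for o in log_obs)
    corr = cov / math.sqrt(var_m * var_o)

    within_two = sum(abs(d) <= math.log10(2.0) for d in diffs) / n
    return {"MBE": mbe, "RMSE": rmse, "CORR": corr, "within_factor_2": within_two}


# Hypothetical AOD pairs (model, observation), for illustration only.
model_aod = [0.12, 0.30, 0.08, 0.55, 0.21]
observed_aod = [0.10, 0.35, 0.05, 0.50, 0.45]
scores = log_space_scores(model_aod, observed_aod)
print({name: round(value, 3) for name, value in scores.items()})
```

Working in log space treats over- and underestimation by the same factor symmetrically, which matches the factor-of-two and factor-of-ten guide lines drawn in the panels.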




Extended Data Table 1 | WHO regions, mortality strata, child and adult mortality characteristics, and the countries and territories included

Region: Africa
  Afr-D (child mortality: high; adult mortality: high): Algeria, Angola, Benin, Burkina Faso, Cameroon, Cape Verde, Chad, Comoros, Equatorial Guinea, Gabon, Gambia, Ghana, Guinea, Guinea-Bissau, Liberia, Madagascar, Mali, Mauritania, Mauritius, Mayotte, Niger, Nigeria, Reunion, Saint Helena, Sao Tome and Principe, Senegal, Seychelles, Sierra Leone, Togo
  Afr-E (child mortality: high; adult mortality: very high): Botswana, Burundi, Central African Republic, Congo, Côte d’Ivoire, Democratic Republic of the Congo, Eritrea, Ethiopia, Kenya, Lesotho, Malawi, Mozambique, Namibia, Rwanda, South Africa, Swaziland, Uganda, United Republic of Tanzania, Zambia, Zimbabwe

Region: Americas
  Amr-A (child mortality: very low; adult mortality: very low): Canada, Cuba, Greenland, Saint Pierre and Miquelon, United States of America
  Amr-B (child mortality: low; adult mortality: low): Anguilla, Antigua and Barbuda, Argentina, Aruba, Bahamas, Barbados, Belize, Bermuda, Brazil, British Virgin Islands, Cayman Islands, Chile, Colombia, Costa Rica, Dominica, Dominican Republic, El Salvador, Falkland Islands, French Guiana, Grenada, Guadeloupe, Guyana, Honduras, Jamaica, Martinique, Mexico, Montserrat, Netherlands Antilles, Panama, Paraguay, Puerto Rico, Saint Kitts and Nevis, Saint Lucia, Saint Vincent and the Grenadines, Suriname, Trinidad and Tobago, Turks and Caicos Islands, United States Virgin Islands, Uruguay, Bolivarian Republic of Venezuela
  Amr-D (child mortality: high; adult mortality: high): Bolivia, Ecuador, Guatemala, Haiti, Nicaragua, Peru

Region: Southeast Asia
  Sear-B (child mortality: low; adult mortality: low): Indonesia, Sri Lanka, Thailand
  Sear-D (child mortality: high; adult mortality: high): Bangladesh, Bhutan, Democratic People’s Republic of Korea, East Timor, India, Maldives, Myanmar, Nepal

Region: Europe
  Eur-A (child mortality: very low; adult mortality: very low): Andorra, Austria, Belgium, Croatia, Cyprus, Czech Republic, Denmark, Faeroe Islands, Finland, France, Germany, Gibraltar, Greece, Guernsey, Iceland, Ireland, Isle of Man, Israel, Italy, Jersey, Liechtenstein, Luxembourg, Malta, Monaco, Netherlands, Norway, Portugal, San Marino, Slovenia, Spain, Svalbard, Sweden, Switzerland, United Kingdom
  Eur-B (child mortality: low; adult mortality: low): Albania, Armenia, Azerbaijan, Bosnia and Herzegovina, Bulgaria, Georgia, Kyrgyzstan, Poland, Romania, Serbia and Montenegro, Slovakia, Tajikistan, The former Yugoslav Republic of Macedonia, Turkey, Turkmenistan, Uzbekistan
  Eur-C (child mortality: low; adult mortality: high): Belarus, Estonia, Hungary, Kazakhstan, Latvia, Lithuania, Republic of Moldova, Russia, Ukraine

Region: Eastern Mediterranean
  Emr-B (child mortality: low; adult mortality: low): Bahrain, Iran, Jordan, Kuwait, Lebanon, Libyan Arab Jamahiriya, Oman, Qatar, Saudi Arabia, Syrian Arab Republic, Tunisia, United Arab Emirates
  Emr-D (child mortality: high; adult mortality: high): Afghanistan, Djibouti, Egypt, Iraq, Morocco, Palestinian Territories, Pakistan, Somalia, Sudan, Yemen

Region: Western Pacific
  Wpr-A (child mortality: very low; adult mortality: very low): Australia, Brunei Darussalam, Japan, New Zealand, Singapore
  Wpr-B (child mortality: low; adult mortality: low): Cambodia, China, Cook Islands, Fiji, French Polynesia, Guam, Hong Kong, Kiribati, Lao People’s Democratic Republic, Macao, Malaysia, Marshall Islands, Pitcairn, Fed. States of Micronesia, Mongolia, Nauru, New Caledonia, Niue, Norfolk Island, Northern Mariana Islands, Palau, Papua New Guinea, Philippines, Republic of Korea, Samoa, Solomon Islands, Taiwan, Tokelau, Tonga, Tuvalu, Vanuatu, Vietnam, Wallis and Futuna




Extended Data Table 2 | Premature mortality related to PM2.5 and O3 in 2010

Mortality attributable to air pollution (deaths ×10³)

                                            PM2.5                                   O3
Region                 Stratum  Population  ALRI    IHD     CEV     COPD    LC      COPD    Total
                                (×10⁶)      <5 yr   ≥30 yr  ≥30 yr  ≥30 yr  ≥30 yr  ≥30 yr
Africa                 Afr-D    379         77      37      62      9       1       2       188
                       Afr-E    430         13      17      15      3       1       0       49
Americas               Amr-A    352         0       38      6       4       6       3       57
                       Amr-B    493         0       5       1       0       1       1       8
                       Amr-D    85          0       0       0       0       0       0       1
Eastern Mediterranean  Emr-B    165         2       34      20      2       1       3       62
                       Emr-D    437         54      80      67      11      3       9       224
Europe                 Eur-A    410         0       73      33      7       16      3       132
                       Eur-B    229         1       57      34      3       7       2       104
                       Eur-C    228         0       110     28      2       4       0       144
Southeast Asia         Sear-B   324         3       37      23      7       3       2       75
                       Sear-D   1,438       61      290     227     117     12      80      787
Western Pacific        Wpr-A    156         0       11      9       0       5       2       27
                       Wpr-B    1,656       19      288     784     209     102     33      1,435
World                           6,783       230     1,079   1,311   374     161     142     3,297




Extended Data Table 3 | Premature mortality by PM2.5- and O3-related diseases in 2010 in countries where it exceeds 9,000 individuals per year (<5 and ≥30 years old)

Columns, left to right: Country; Deaths in 2010; Natural; Industry; Land traffic; Residential energy; Power generation; Biomass burning; Agriculture
China 1,357,353 118,954 106,754 44,751 435,763 237,324 18,414 395,390 
India 644,993 74,145 42,336 30,070 325,604 89,130 42,163 41,541 
Pakistan 110,571 63,147 2,478 3,389 34,707 2,761 2,108 1,977 
Bangladesh 91,923 0 6,117 5,656 50,382 13,697 6,418 9,652 
Nigeria 89,022 68,479 176 85 12,006 258 7,554 462 
Russia 67,152 630 5,193 7,731 4,885 14,606 5,477 28,628 
USA 54,905 1,290 3,297 11,435 3,192 16,929 2,537 16,221 
Indonesia 52,417 71 1,814 1,244 31,498 2,379 14,338 1,070 
Ukraine 51,238 55 4,632 5,188 3,011 9,459 2,326 26,563 
Vietnam 44,097 0 3,627 1,686 22,575 5,486 5,378 5,343 
Egypt 35,322 32,651 210 450 190 816 61 941 
Germany 34,422 0 4,452 6,928 2,684 4,402 279 15,675 
Turkey 31,943 4,912 3,414 3,487 2,812 6,194 1,851 9,269 
Iran 26,108 21,175 662 969 311 1,101 230 1,656 
Japan 25,516 0 4,567 2,526 3,046 4,458 1,154 9,763 
Sudan 24,255 22,249 59 47 200 133 1,488 77 
Myanmar 22,537 10 1,082 842 8,287 2,662 8,707 944 
Italy 20,809 1,251 2,930 3,519 1,454 3,192 376 8,085 
Iraq 20,335 18,513 209 390 109 510 91 510 
Thailand 19,843 0 2,211 1,469 5,207 2,944 6,529 1,481 
France 17,800 0 2,515 3,152 2,468 2,113 211 7,339 
Dem. Rep. Korea (N) 16,783 0 1,996 770 3,445 3,467 715 6,386 
United Kingdom 15,488 0 1,627 3,091 854 2,412 63 7,438 
Algeria 14,954 11,262 1,113 656 194 773 107 847 
Dem. Republic Congo 14,880 901 193 45 1,405 121 12,119 92 
Romania 14,633 0 1,336 1,825 1,403 3,225 573 6,270 
Saudi Arabia 14,600 13,708 165 165 46 308 38 167 
Poland 14,561 0 1,451 1,886 2,273 2,265 372 6,310
Korea (S) 14,352 0 2,045 1,108 2,049 2,803 321 6,024 
Morocco (+W. Sahara) 14,217 10,929 966 346 224 856 99 795 
Niger 13,061 12,893 9 0 89 16 32 19 
Uzbekistan 11,598 7,341 671 590 578 623 265 1,526 
Nepal 10,926 56 641 510 7,481 1,090 681 465 
Mali 9,444 9,060 1 0 101 0 273 6 
Ghana 9,317 5,552 105 68 525 68 2,922 75
Burkina Faso 9,295 8,851 3 0 127 20 256 35 
World 3,297,370 596,895 226,137 163,852 1,002,370 464,748 179,268 664,100 




Extended Data Table 4 | Premature mortality related to PM2.5 and O3 in 2025

Mortality attributable to air pollution (deaths ×10³)

                                            PM2.5                                   O3
Region                 Stratum  Population  ALRI    IHD     CEV     COPD    LC      COPD    Total
                                (×10⁶)      <5 yr   ≥30 yr  ≥30 yr  ≥30 yr  ≥30 yr  ≥30 yr
Africa                 Afr-D    538         102     60      99      14      2       5       282
                       Afr-E    597         17      30      27      4       1       2       81
Americas               Amr-A    395         0       48      8       5       7       6       74
                       Amr-B    561         0       9       3       1       1       3       17
                       Amr-D    105         0       1       0       0       0       0       1
Eastern Mediterranean  Emr-B    197         2       55      33      3       2       5       100
                       Emr-D    579         61      133     109     17      6       19      345
Europe                 Eur-A    423         0       79      37      8       17      5       146
                       Eur-B    246         1       74      46      4       9       4       138
                       Eur-C    221         0       123     34      3       5       0       165
Southeast Asia         Sear-B   362         3       54      38      11      5       7       118
                       Sear-D   1,697       76      461     398     201     21      155     1,312
Western Pacific        Wpr-A    158         0       11      11      1       5       4       32
                       Wpr-B    1,760       18      377     1,038   284     138     51      1,906
World                           7,838       280     1,515   1,881   556     219     266     4,717




Extended Data Table 5 | Premature mortality related to PM2.5 and O3 in 2050

Mortality attributable to air pollution (deaths ×10³)

                                            PM2.5                                   O3
Region                 Stratum  Population  ALRI    IHD     CEV     COPD    LC      COPD    Total
                                (×10⁶)      <5 yr   ≥30 yr  ≥30 yr  ≥30 yr  ≥30 yr  ≥30 yr
Africa                 Afr-D    874         137     121     203     28      4       8       501
                       Afr-E    933         21      64      58      10      1       4       158
Americas               Amr-A    451         0       59      10      6       10      7       92
                       Amr-B    609         0       15      5       1       1       4       26
                       Amr-D    131         0       1       0       0       0       0       1
Eastern Mediterranean  Emr-B    222         2       80      50      5       3       7       147
                       Emr-D    799         64      241     196     32      10      33      576
Europe                 Eur-A    431         0       85      43      9       20      5       162
                       Eur-B    252         1       95      68      6       12      5       187
                       Eur-C    203         0       127     45      3       5       1       181
Southeast Asia         Sear-B   382         3       69      51      14      7       8       152
                       Sear-D   1,950       101     796     755     405     41      219     2,317
Western Pacific        Wpr-A    149         0       10      9       0       4       4       27
                       Wpr-B    1,712       16      403     1,110   309     151     53      2,042
World                           9,098       346     2,166   2,604   828     270     358     6,572




Extended Data Table 6 | Population and premature mortality (deaths per year) related to PM2.5 and O3 in the most polluted megacities and conurbations in 2010, 2025 and 2050

                  2010                 2025                 2050
Megacity          Population  Deaths   Population  Deaths   Population  Deaths
                  (×10⁶)      (×10³)   (×10⁶)      (×10³)   (×10⁶)      (×10³)
London            8.1         2.8      9.1         3.4      10.2        4.2
Paris             8.4         3.1      9.2         3.8      10.2        4.6
Moscow            14.9        8.6      14.4        10.8     13.1        11.7
Po Valley         3.4         1.3      3.4         1.4      3.2         1.4
Istanbul          11.1        5.6      13.2        8.5      14.5        13.2
Teheran           9.7         2.9      11.1        4.8      11.4        6.9
Cairo             12.5        6.0      15.9        8.2      19.8        11.4
Lagos             8.3         3.7      12.7        6.0      22.0        11.2
Johannesburg¹     6.9         1.5      7.7         2.3      8.6         3.8
Karachi           11.9        7.3      15.4        11.4     19.4        17.9
Mumbai²           18.0        10.2     22.1        17.4     26.8        33.1
Delhi             22.5        19.7     27.8        31.1     33.3        52.0
Kolkata           20.3        13.5     28.4        26.6     38.8        54.8
Dhaka             22.8        13.1     31.2        26.4     38.2        49.9
Jakarta           22.5        10.4     26.1        16.4     29.0        22.1
Chengdu           6.2         7.4      6.4         9.5      5.9         9.7
Beijing           10.8        13.7     11.3        17.3     10.4        17.7
Tianjin           3.7         4.9      3.9         6.2      3.6         6.3
Shanghai          14.1        14.9     14.3        18.9     13.2        19.4
Seoul             20.8        6.6      21.7        8.5      20.3        8.7
Tokyo             29.2        6.0      28.1        6.4      24.2        5.4
Osaka             13.5        2.8      12.8        3.1      10.9        2.6
Hong Kong         6.9         2.6      7.6         3.7      8.8         4.4
Pearl River area  53.1        49.2     56.0        65.2     52.9        67.4
Manila            19.8        0.6      26.5        2.3      37.3        4.5
Bangkok           8.8         3.1      9.5         4.9      9.2         5.7
New York          12.5        3.2      14.5        4.2      17.5        5.2
Los Angeles       12.2        4.1      14.6        5.2      17.7        7.0
Mexico City       10.7        1.6      12.3        3.3      13.9        5.3

¹ Includes Pretoria
² Includes east suburb

The names of the megacities have been colour coded according to the WHO regions: Europe, black font; Eastern Mediterranean, blue; Africa, red; Southeast Asia, green; Western Pacific, brown; Americas, purple.




Extended Data Table 7 | Premature mortality related to PM2.5 and O3 for the population aged <5 years and ≥30 years

Mortality attributable to air pollution (deaths ×10³)

                                                   PM2.5                                   O3
WHO region             Year    Population  ALRI    IHD     CEV     COPD    LC      COPD    Total
                               (×10⁶)      <5 yr   ≥30 yr  ≥30 yr  ≥30 yr  ≥30 yr  ≥30 yr
Africa                 2010    809         90      55      77      11      2       2       237
                       2010*                       37                                      219
Americas               2010    930         0       44      8       4       7       5       68
                       2010*                       35                                      59
Eastern Mediterranean  2010    602         56      115     86      12      5       12      286
                       2010*                       104                                     275
Europe                 2010    867         1       239     95      13      27      6       381
                       2010*                       213                                     355
Southeast Asia         2010    1,762       64      327     250     124     15      82      862
                       2010*                       169                                     704
Western Pacific        2010    1,812       19      299     794     209     107     35      1,463
                       2010*                       250                                     1,414
World                  2010    6,783       230     1,079   1,311   374     161     142     3,297
                       2010*                       808                                     3,026

*In these rows, IHD mortality related to residential energy use (RCO) and biomass burning has been excluded.




doi:10.1038/nature15256 


Non-adaptive plasticity potentiates rapid adaptive 
evolution of gene expression in nature 


Cameron K. Ghalambor, Kim L. Hoke, Emily W. Ruell, Eva K. Fischer, David N. Reznick & Kimberly A. Hughes


Phenotypic plasticity is the capacity for an individual genotype
to produce different phenotypes in response to environmental
variation. Most traits are plastic, but the degree to which plasticity
is adaptive or non-adaptive depends on whether environmentally
induced phenotypes are closer to or further away from the local
optimum. Existing theories make conflicting predictions about
whether plasticity constrains or facilitates adaptive evolution.
Debate persists because few empirical studies have tested the
relationship between initial plasticity and subsequent adaptive
evolution in natural populations. Here we show that the direction of
plasticity in gene expression is generally opposite to the direction
of adaptive evolution. We experimentally transplanted Trinidadian
guppies (Poecilia reticulata) adapted to living with cichlid predators
to cichlid-free streams, and tested for evolutionary divergence in
brain gene expression patterns after three to four generations. We
find 135 transcripts that evolved parallel changes in expression
within the replicated introduction populations. These changes are
in the same direction exhibited in a native cichlid-free population,
suggesting rapid adaptive evolution. We find that 89% of these
transcripts exhibited non-adaptive plastic changes in expression when
the source population was reared in the absence of predators, as they
are in the opposite direction to the evolved changes. By contrast, the
remaining transcripts exhibiting adaptive plasticity show reduced
population divergence. Furthermore, the most plastic transcripts in
the source population evolved reduced plasticity in the introduction
populations, suggesting strong selection against non-adaptive
plasticity. These results support models predicting that adaptive
plasticity constrains evolution, whereas non-adaptive plasticity
potentiates evolution by increasing the strength of directional
selection. The role of non-adaptive plasticity in evolution has
received relatively little attention; however, our results suggest that
it may be an important mechanism that predicts evolutionary
responses to new environments.

A long-standing problem in evolutionary biology is to understand the relationship between environmentally induced variation observed within a generation, and genetically based evolutionary changes between generations. It has long been recognized that the expression of traits is plastic: the same genotype can produce a range of phenotypes in response to different environmental cues. However, the causal relationship between a trait’s plasticity and that trait’s evolution remains an unresolved and contentious problem. Traditional models of adaptive evolution ignored any role for plasticity, because environmentally induced plasticity was viewed as non-heritable variation. Current models recognize that environments can cause predictable patterns of plasticity that are either adaptive or non-adaptive with respect to the local phenotypic optimum; such plasticity may influence evolutionary change by altering the distribution of phenotypes upon which selection acts. For example, plasticity is adaptive when the phenotype is altered in the same direction favoured by natural selection in that environment. Some models predict that adaptive plasticity weakens the strength of directional selection and slows adaptive evolution. Other models suggest that adaptive plasticity is a critical first step in the process of adaptive evolution (for example, via genetic assimilation or accommodation), for instance by increasing population persistence in new environments (the Baldwin effect) and allowing more time for selection to act on heritable variation. In contrast, plasticity is non-adaptive when a population encounters an environment that induces the production of phenotypes further away from the local optimum, resulting in a negative relationship between the direction of plasticity and the direction of adaptive evolution. Non-adaptive plasticity reduces relative fitness and is predicted to increase the strength of directional selection because traits are further from the phenotypic optimum, resulting in an evolutionary response sometimes referred to as ‘genetic compensation’ or ‘counter-gradient variation’. Laboratory selection experiments have found support for both a positive (adaptive) and a negative (non-adaptive) relationship between the direction of plastic responses and the direction of evolution. However, testing such relationships in natural populations has been challenging because comparisons between ancestral and derived populations typically occur long after the populations have diverged. Here, we test the relationship between plasticity and the early stages of evolutionary divergence using experiments in nature. We assess both ancestral plasticity in the source population and evolved changes in replicated derived populations by comparing plastic and evolved patterns of gene expression.

We quantified gene expression in Trinidadian guppies derived from natural populations and from populations undergoing early divergence following an experimental translocation. Individuals from a population that experiences high mortality from fish predators (high-predation, denoted as HP), particularly the pike cichlid (Crenicichla frenata), were introduced into each of two low-predation sites lacking cichlids: ‘Intro1’ and ‘Intro2’ (Extended Data Fig. 1). Thirty-eight gravid females and 38 mature males were introduced into each stream. One year after the introduction (3-4 guppy generations), guppies were collected from the ancestral HP source population, descendant introduction populations (Intro1 and Intro2), and a naturally colonized low-predation guppy population (denoted as LP) from the same drainage (Methods). The natural LP population represents an older evolutionary descendant of the HP source population adapted to the same predation regime as the experimental populations. It thus provides an a priori prediction for the expected direction of evolutionary change.

To assess plastic and evolved changes in transcription, we bred wild-caught fish under common laboratory conditions for two generations and generated unique family lines within each of the four populations. Two generations of rearing in a common environment controls for environmental, maternal and other non-heritable sources of variation. Within 24 h of birth, second-generation full siblings of each family were randomly split between tanks that differed in exposure to chemical predator cues. Siblings reared with predator cues were raised in recirculating units that housed a cichlid within the water supply. Cichlids were fed two guppies per day. Predator cues included predator kairomones from the cichlid as well as any alarm pheromones from guppies, simulating the ancestral olfactory environment. Guppies reared without predator cues were housed in identical recirculating units without the cichlid predator, simulating the derived environment. Differences in transcription between siblings reared in these two environments represent predator-induced plasticity in gene expression, while differences between populations measured under the same conditions for multiple generations represent heritable differences.

1Department of Biology, Colorado State University, Fort Collins, Colorado 80523, USA. 2Graduate Degree Program in Ecology, Colorado State University, Fort Collins, Colorado 80523, USA. 3Department of Biology, University of California, Riverside, California 92521, USA. 4Department of Biological Science, Florida State University, Tallahassee, Florida 32306-4295, USA.

372 | NATURE | VOL 525 | 17 SEPTEMBER 2015

To determine whether the introduction populations showed evidence for adaptive evolutionary divergence, we measured patterns of transcription in all four populations under the derived rearing environment. We measured the abundance of 37,493 messenger RNA transcripts expressed in whole brains of mature males reared without predator cues (mean age = 124.03 days old, range = 118-154), using high-throughput RNA sequencing. We used multivariate between-group principal components analysis (Methods) to visualize overall transcription differences among the four populations (Fig. 1). Two major axes explained 74.5% of the variation. Principal component 1 (PC1; 44.4% of variation) separated the naturally occurring LP population from the natural HP and introduction populations, and thus appears to reflect long-term divergence between these populations. PC2 (30.1% of variation) separated the HP source population from the two introduction populations and the natural LP population, thus capturing a signal of rapid and parallel evolutionary divergence to the LP environment (Fig. 1). Whereas genetic drift, founder effects, and unique attributes of each of the introduction streams would be expected to produce independent genetic changes in the introduction populations, the parallel change of Intro1 and Intro2 in the same direction as the natural LP population supports the interpretation that PC2 describes rapid adaptive evolution. Indeed, the rates of evolutionary divergence in gene expression between the source population and introduction populations for the top 500 transcripts loading on PC2 (median Haldanes (a change in phenotypic standard deviations per generation) in Intro1 = 0.256 and Intro2 = 0.226) are comparable to rapid rates of evolution observed in life history and morphology during previous experimental introductions of guppies (Methods and Extended Data Fig. 2a, b).

Figure 1 | Rapid evolutionary divergence in gene expression as measured in second-generation laboratory-born guppies derived from the wild. Shown is a principal components analysis (axes PC1 and PC2) of all 37,493 expressed genes in the four populations. HP is a naturally occurring high-predation population that is the source population for the two experimentally introduced populations, Intro1 and Intro2. LP is a naturally occurring low-predation population. Points represent individual families within each population, and are connected by solid lines. Dashed lines represent the major and minor axes of the confidence ellipse for each population.
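
The Haldane rates quoted above express divergence in phenotypic standard deviations per generation. A minimal sketch of that calculation follows, assuming a pooled within-population standard deviation as the scaling factor; the function name, the pooling choice and the input values are illustrative assumptions, not the authors' code or data (their procedure is described in the Methods).

```python
from math import sqrt
from statistics import mean, variance


def haldanes(ancestral, derived, generations):
    """Change in mean, in pooled standard deviations, per generation."""
    n1, n2 = len(ancestral), len(derived)
    # Pooled variance weights each sample variance by its degrees of freedom.
    pooled_var = ((n1 - 1) * variance(ancestral)
                  + (n2 - 1) * variance(derived)) / (n1 + n2 - 2)
    return (mean(derived) - mean(ancestral)) / sqrt(pooled_var) / generations


# Hypothetical log-expression values for a single transcript.
hp = [1.0, 2.0, 3.0]
intro = [3.0, 4.0, 5.0]
rate = haldanes(hp, intro, generations=4)
print(rate)  # 0.5
```

With 3-4 generations elapsed, a median rate near 0.25 Haldanes corresponds to a shift of roughly one standard deviation in mean expression over the course of the experiment.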

To distinguish transcripts that exhibited evolution in the introduction populations as a result of selection from those that exhibited changes as a result of other processes, we identified transcripts that exhibited highly significant parallel evolutionary change in both introduction populations and that diverged in the same direction in the natural LP population. Permuted data sets (n = 250) were generated by randomly reassigning population labels to individual samples. We then used general linear statistical models to assess divergence in the two introduction populations and the natural LP population (that is, HP versus Intro1 and HP versus Intro2 and HP versus LP) for each transcript (Methods). If the test statistic for each of the three contrasts fell in the extreme 5% of the distribution of the permutation test statistics, and the contrasts all had the same sign, we called the transcript concordantly differentially expressed (CDE). We found 135 transcripts that met these stringent criteria, which was many more than observed in the permuted data sets (median = 6, interquartile range = 3-14; Methods and Supplementary Table 1). By contrast, only one transcript diverged significantly in opposite directions in the two descendant introduction populations, consistent with expectations based on the distribution of permuted values (median = 1, interquartile range = 1-2). These 135 CDE transcripts loaded highly on PC2 (the median rank of the PC2 loadings for the CDE transcripts was 361 out of 37,493 total transcripts). The prevalence of these parallel changes suggests that this subset of transcripts evolved through the direct or indirect effects of natural selection, because genetic drift would have produced discordant as well as concordant evolution in the descendant introduction populations. Indeed, divergence in transcription between the ancestral and introduction populations greatly exceeded allele frequency divergence in putatively neutral microsatellite loci (Extended Data Table 1). Collectively, these results demonstrate rapid and repeatable patterns of adaptive evolutionary divergence in transcription, similar to what has been observed for other fitness-related guppy traits following the colonization of low-predation environments.
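
The CDE criterion described above can be sketched for a single transcript as follows. This is a simplified illustration under stated assumptions: it uses a plain difference in group means in place of the general linear models used in the study, treats "extreme 5%" as a two-sided comparison of absolute values, and all names and data are hypothetical.

```python
import random
from statistics import mean

DERIVED = ("Intro1", "Intro2", "LP")  # populations contrasted with the HP source


def contrasts(values, labels):
    """Difference in mean expression (derived minus HP) for each contrast."""
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(label, []).append(value)
    return [mean(groups[pop]) - mean(groups["HP"]) for pop in DERIVED]


def is_cde(values, labels, n_perm=250, alpha=0.05, seed=1):
    """True if all three contrasts are extreme under permutation and share a sign."""
    rng = random.Random(seed)
    observed = contrasts(values, labels)
    shuffled = list(labels)
    exceed = [0, 0, 0]  # permutations at least as extreme as each observed contrast
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # randomly reassign population labels to samples
        for i, stat in enumerate(contrasts(values, shuffled)):
            if abs(stat) >= abs(observed[i]):
                exceed[i] += 1
    extreme = all(count / n_perm < alpha for count in exceed)
    same_sign = all(o > 0 for o in observed) or all(o < 0 for o in observed)
    return extreme and same_sign


# Hypothetical expression values: five families per population.
labels = ["HP"] * 5 + ["Intro1"] * 5 + ["Intro2"] * 5 + ["LP"] * 5
parallel = [0.0, 0.1, 0.2, 0.1, 0.0] + [3.0, 3.1, 2.9, 3.2, 3.0] * 3
discordant = [1.0] * 5 + [2.0] * 5 + [0.0] * 5 + [2.0] * 5
print(is_cde(parallel, labels), is_cde(discordant, labels))
```

The sign check is what separates concordant from discordant divergence: a transcript that shifts upward in one introduction population and downward in the other fails regardless of how large the shifts are.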

Given the evidence for rapid evolution of transcription, we determined if the pattern of ancestral plasticity in the HP source population predicted adaptive evolution in the descendant introduction populations. We assessed plasticity in the HP population by measuring the change in transcript abundance of full siblings reared with and without the predator cue (that is, simulating the ancestral high-predation and derived low-predation environments). If plasticity in transcript abundance was in the same direction as the parallel divergence observed in CDE transcripts, we considered plasticity to be adaptive. If the plastic changes were in the opposite direction to the evolved changes in CDE transcripts, we considered the plasticity to be non-adaptive (see Extended Data Fig. 3). We found a robust pattern of non-adaptive plasticity predicting evolutionary change in CDE transcripts; when HP fish were reared without the predator cue, the change in transcript abundance was overwhelmingly in the opposite direction to that of evolved changes in the descendant introduction populations (Fig. 2). The negative association between the direction of plasticity and the direction of evolution was highly significant (χ² = 89.9, d.f. = 1), which is outside the range of all 250 permuted χ² values (range = 0.0-55.9), with 89% (120 of 135) of all transcripts exhibiting a plastic response opposite to the direction of evolution (see grey points in Fig. 2). For the remaining 11% (15 of 135) of transcripts, in which the direction of plasticity and evolution aligned, the degree of plasticity was negligible (see black points in Fig. 2). The correlation between ancestral plasticity and evolution (r = −0.82) is substantially more negative than correlations generated from a randomization test (P < 0.0001; Fig. 2). These results suggest that plasticity potentiates rapid adaptive evolution, not because plasticity is adaptive, as is assumed in many evolutionary models, but rather because it is non-adaptive and under stronger selection to change (Fig. 2). The same pattern is observed when we restrict the analysis to a separate data set that included 565 transcripts exhibiting significant plasticity to the rearing treatments in the HP source population (Supplementary Table 2 and Extended Data Fig. 4).

[Figure 2 axes: x, ancestral plasticity in gene expression (log-transformed count per million reads); y, evolved divergence in gene expression (log-transformed count per million reads); inset, count versus correlation coefficients.]

Figure 2 | Rapid evolutionary divergence is highly correlated with non-adaptive plasticity. Shown is a scatter plot of ancestral plasticity (change in transcript abundance in response to the absence of cichlid predator cues) against adaptive evolutionary divergence (135 concordantly differentially expressed transcripts) in the descendant populations transplanted to streams lacking cichlid predators. Grey points denote transcripts exhibiting non-adaptive plasticity, and black points denote adaptive plasticity. Inset shows the distribution of the Spearman rank correlations between evolutionary divergence and ancestral plasticity from 1,000 permuted correlation values for the 135 concordantly differentially expressed transcripts, with the arrow indicating the observed correlation, which is substantially more negative than all permuted values.
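
The sign-based classification of plasticity and the rank-correlation randomization test described above can be illustrated with a short sketch. The helper names and the four example transcripts are hypothetical; only the counts 120 and 135 come from the text, and the one-sided randomization shown here is an assumption about how such a test might be set up.

```python
import random


def classify(plasticity, divergence):
    """Adaptive if plastic and evolved changes share a sign, else non-adaptive."""
    product = plasticity * divergence
    if product > 0:
        return "adaptive"
    if product < 0:
        return "non-adaptive"
    return "no change"


def ranks(xs):
    """1-based ranks, with tied values sharing their mean rank."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    out = [0.0] * len(xs)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and xs[order[j + 1]] == xs[order[i]]:
            j += 1
        for k in range(i, j + 1):
            out[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return out


def spearman(xs, ys):
    """Spearman correlation computed as the Pearson correlation of ranks."""
    rx, ry = ranks(xs), ranks(ys)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


def permutation_p(xs, ys, n_perm=1000, seed=1):
    """Share of shuffled correlations at least as negative as the observed one."""
    rng = random.Random(seed)
    observed = spearman(xs, ys)
    shuffled = list(ys)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if spearman(xs, shuffled) <= observed:
            hits += 1
    return hits / n_perm


# Hypothetical plastic and evolved changes for four transcripts.
plasticity = [0.1, 0.2, 0.3, 0.4]
divergence = [-0.1, -0.3, -0.2, -0.5]
kinds = [classify(p, d) for p, d in zip(plasticity, divergence)]
rho = spearman(plasticity, divergence)
p_value = permutation_p(plasticity, divergence)
share_non_adaptive = 120 / 135  # counts reported in the text, about 89%
print(kinds, round(rho, 2), round(share_non_adaptive * 100))
```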

The magnitude of plasticity can also evolve in response to selection. If natural selection acts most strongly on transcripts exhibiting non-adaptive plasticity, we predicted plasticity should evolve to be reduced in the descendant introduction populations. We tested this prediction by comparing plasticity in the ancestral source population to that in the derived introduction populations for the subset of transcripts that were CDE. The magnitude of plasticity decreased in the introduction populations (median change = −11%, sign test M = −45.5, P < 0.001). Moreover, the decline in plasticity in these descendant populations was negatively associated with the magnitude of ancestral plasticity (P < 0.001 based on a randomization test; Fig. 3), in accord with the idea that selection acts more strongly to decrease plasticity in those transcripts showing the greatest non-adaptive plasticity. Thus, traits exhibiting initially non-adaptive plastic responses to new environments may be a transient phenomenon, because selection may act to rapidly reduce their magnitude.
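
For readers unfamiliar with the sign-test statistic reported above, the sketch below assumes the common convention M = (n₊ − n₋)/2 (as used, for example, by SAS PROC UNIVARIATE), where n₊ and n₋ count transcripts whose plasticity increased or decreased. Under the further assumption that none of the 135 CDE transcripts were tied, M = −45.5 would correspond to 22 increases and 113 decreases; these counts are inferred for illustration and are not reported in the text.

```python
from math import comb


def sign_test(n_pos, n_neg):
    """Return (M, two-sided exact binomial P) for a sign test, ignoring ties."""
    n = n_pos + n_neg
    m = (n_pos - n_neg) / 2
    k = min(n_pos, n_neg)
    # Probability of a split at least this lopsided when increases and
    # decreases are equally likely.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return m, min(1.0, 2 * tail)


# Inferred counts (assumption): 22 increases, 113 decreases, no ties.
m, p = sign_test(22, 113)
print(m, p < 0.001)  # -45.5 True
```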

Attempts to model the effects of plasticity on subsequent adaptive evolution often assume that plasticity is adaptive. However, when populations experience novel environments, as when we experimentally transplanted guppies, many of the initial plastic responses are likely to be non-adaptive, because selection has not had an opportunity to act on the genetic variation for plasticity. In such cases, both adaptive and non-adaptive plastic responses would be expected by chance, but traits exhibiting adaptive plasticity should be under weaker directional selection relative to traits that exhibit non-adaptive plasticity and are further from the new phenotypic optimum. Indeed, both theoretical and empirical studies show that adaptive plasticity reduces directional selection. While we were unable to directly estimate the strength of selection on transcript abundance phenotypes, previous introduction experiments have demonstrated strong directional selection and rapid adaptation in response to low-predation environments. If traits exhibiting non-adaptive plasticity are under stronger directional selection, then newly established populations will probably face a dual challenge if they are to persist and avoid extinction. First, they must overcome the fitness costs associated with strong directional selection on non-adaptive responses, including declines in population size; and second, they must harbour enough genetic variation to rapidly respond to selection. Because heritable genetic variation for transcription appears to be common, the potential for rapid adaptation may ameliorate one set of costs. However, other costs may be more difficult to avoid, as models suggest that population size, the distance a non-adaptive trait is from the local optimum, and the relationship of that trait to fitness will ultimately determine whether populations persist. In the case of the introductions here, such costs may have been reduced, because individuals were transplanted to relatively more ‘benign’ conditions, such that high predator-induced mortality was replaced with increased competition, reduced food availability, and other environmental factors characterizing the low-predation streams.

[Figure 3 axes: x, magnitude of plasticity in the source population (log-transformed count per million reads); y, evolved mean change in plasticity within introduction populations (log-transformed count per million reads); inset, distribution of correlation coefficients.]

Figure 3 | Rapid evolution of reduced plasticity. Shown is a scatter plot of the absolute values for the magnitude of ancestral plasticity (the normalized difference in transcript abundance between the presence and absence of cichlid cues) against the change in plasticity between the source and introduction populations. Inset shows the distribution of the Spearman rank correlations between the magnitude of plasticity in the ancestral population and the change in plasticity in the introduction populations from 1,000 permuted correlation values for the 135 concordantly differentially expressed transcripts, with the arrow indicating the observed correlation, which is substantially more negative than all permuted values.

Understanding the role of phenotypic plasticity in adaptive evolution remains a contentious problem in evolutionary biology, in part because few studies have been able to capture the initial patterns of plasticity and subsequent adaptive divergence of traits in natural populations. Nevertheless, it is during the early stages of adaptive divergence that selection in new environments is likely to be strongest, and when plasticity will either reduce or exacerbate the initial mismatch between the mean and optimal phenotypic responses. Recent work in these same guppy populations documents a similar pattern in which non-adaptive plasticity potentiates a rapid evolution of growth rate, suggesting a general pattern that extends to other phenotypic traits. While such results are consistent with many models of how selection acts on phenotypes, the role of non-adaptive plasticity in adaptive evolution remains understudied, despite arguments that it may be a common, but cryptic, form of evolution. More generally, understanding when and how plasticity affects evolutionary response is critical for predicting the short- and long-term effects of environmental change on organisms. Predictive evolutionary models of phenotypic plasticity also have practical importance. For example, disease states within organisms respond plastically to treatments and also evolve; thus, gene expression profiles can be used (as was done here) to predict how response to treatment influences disease progression. Additional experimental evolution studies, especially those conducted in natural environments, will be critical for validating and parameterizing future models of how plasticity influences evolutionary change.


Online Content Methods, along with any additional Extended Data display items 
and Source Data, are available in the online version of the paper; references unique 
to these sections appear only in the online paper. 


Received 4 April; accepted 3 August 2015. 
Published online 2 September 2015. 


1. West-Eberhard, M. J. Developmental Plasticity and Evolution (Oxford Univ. Press, 2003).

2. Schmalhausen, I. I. Factors of Evolution: the Theory of Stabilizing Selection (Blakiston, 1949).

3. López-Maury, L., Marguerat, S. & Bähler, J. Tuning gene expression to changing environments: from rapid responses to evolutionary adaptation. Nature Rev. Genet. 9, 583-593 (2008).

4. Ghalambor, C. K., McKay, J. K., Carroll, S. P. & Reznick, D. N. Adaptive versus non-adaptive phenotypic plasticity and the potential for contemporary adaptation in new environments. Funct. Ecol. 21, 394-407 (2007).

5. Baldwin, J. M. Development and Evolution (Macmillan Company, 1902).

6. Ancel, L. W. Undermining the Baldwin expediting effect: does phenotypic plasticity accelerate evolution? Theor. Popul. Biol. 58, 307-319 (2000).

7. Price, T. D., Qvarnström, A. & Irwin, D. E. The role of phenotypic plasticity in driving genetic evolution. Proc. R. Soc. Lond. B 270, 1433-1440 (2003).

8. Paenke, I., Sendhoff, B. & Kawecki, T. J. Influence of plasticity and learning on evolution under directional selection. Am. Nat. 170, E47-E58 (2007).

9. Lande, R. Adaptation to an extraordinary environment by evolution of phenotypic plasticity and genetic assimilation. J. Evol. Biol. 22, 1435-1446 (2009).

10. Chevin, L.-M., Lande, R. & Mace, G. M. Adaptation, plasticity, and extinction in a changing environment: towards a predictive theory. PLoS Biol. 8, e1000357 (2010).

11. Grether, G. F. Environmental change, phenotypic plasticity, and genetic compensation. Am. Nat. 166, E115-E123 (2005).

12. Conover, D. O., Duffy, T. A. & Hice, L. A. The covariance between genetic and environmental influences across ecological gradients: reassessing the evolutionary significance of countergradient and cogradient variation. Ann. NY Acad. Sci. 1168, 100-129 (2009).

13. Wright, S. Evolution in Mendelian populations. Genetics 16, 97-159 (1931).

14. Waddington, C. H. Genetic assimilation. Adv. Genet. 10, 257-293 (1961).

15. Suzuki, Y. & Nijhout, H. F. Evolution of a polyphenism by genetic accommodation. Science 311, 650-652 (2006).

16. Schaum, C. E. & Collins, S. Plasticity predicts evolution in marine algae. Proc. R. Soc. Lond. B 281, 20141486 (2014).

17. Losos, J. B. et al. Evolutionary implications of phenotypic plasticity in the hindlimb of the lizard Anolis sagrei. Evolution 54, 301-305 (2000).

18. Wund, M. A., Baker, J. A., Clancy, B., Golub, J. L. & Foster, S. A. A test of the “flexible stem” model of evolution: ancestral plasticity, genetic accommodation, and morphological divergence in the threespine stickleback radiation. Am. Nat. 172, 449-462 (2008).

19. McCairns, R. J. & Bernatchez, L. Adaptive divergence between freshwater and marine sticklebacks: insights into the role of phenotypic plasticity from an integrated analysis of candidate gene expression. Evolution 64, 1029-1047 (2010).

20. Scoville, A. G. & Pfrender, M. E. Phenotypic plasticity facilitates recurrent rapid adaptation to introduced predators. Proc. Natl Acad. Sci. USA 107, 4260-4263 (2010).

21. Willing, E.-M. et al. Genome wide single nucleotide polymorphisms reveal population history and adaptive divergence in wild guppies. Mol. Ecol. 19, 968-984 (2010).

22. Handelsman, C. A. et al. Predator-induced phenotypic plasticity in metabolism and rate of growth: rapid adaptation to a novel environment. Integr. Comp. Biol. 53, 975-988 (2013).

23. Gibson, G. & Weir, B. The quantitative genetics of transcription. Trends Genet. 21, 616-623 (2005).

24. Leder, E. H. et al. The evolution and adaptive potential of transcriptional variation in sticklebacks-Signatures of selection and widespread heritability. Mol. Biol. Evol. 32, 674-689 (2015).

25. Reznick, D. A., Bryga, H. & Endler, J. A. Experimentally induced life-history evolution in a natural population. Nature 346, 357-359 (1990).

26. Reznick, D. N., Shaw, F. H., Rodd, F. H. & Shaw, R. G. Evaluation of the rate of evolution in natural populations of guppies (Poecilia reticulata). Science 275, 1934-1937 (1997).

27. Charmantier, A. et al. Adaptive phenotypic plasticity in response to climate change in a wild bird population. Science 320, 800-803 (2008).

28. Gomulkiewicz, R. & Holt, R. D. When does evolution by natural selection prevent extinction? Evolution 49, 201-207 (1995).

29. Reznick, D., Butler, M. J., IV & Rodd, H. Life-history evolution in guppies. VII. The comparative ecology of high- and low-predation environments. Am. Nat. 157, 126-140 (2001).

30. Merlo, L. M., Pepper, J. W., Reid, B. J. & Maley, C. C. Cancer as an evolutionary and ecological process. Nature Rev. Cancer 6, 924-935 (2006).


Supplementary Information is available in the online version of the paper. 


Acknowledgements This work was supported by grants from the National Science Foundation (DEB-0846175 to C.K.G., EF-0623632 to D.N.R., and IOS-0934451 and IOS-1354775 to K.A.H.). We thank C. Handelsman, K. Langin, D. Broder, E. Duval, I. Janowitz, E. Lange, A. Shah, J. Havrid, E. Kane and L. Angeloni for helpful comments on the study. Computing for this project was performed on the Spear cluster at the Research Computing Center at the Florida State University.


Author Contributions C.K.G., K.L.H. and K.A.H. planned and executed the study, E.W.R. 
reared the fish, E.K.F. collected the tissues, K.A.H. analysed the gene expression data,
D.N.R. planned and oversaw the field introduction experiments, and C.K.G. oversaw the 
laboratory experiments. All authors participated in writing the manuscript. 


Author Information The sequence data are available at the Sequence Reads Archive 
(SRA) under accession number SRP06236. Reprints and permissions information is 
available at www.nature.com/reprints. The authors declare no competing financial 
interests. Readers are welcome to comment on the online version of the paper. 
Correspondence and requests for materials should be addressed to C.K.G. 
(cameron1@colostate.edu) or K.A.H. (kahughes@bio.fsu.edu). 




METHODS 


No statistical methods were used to predetermine sample size. The investigators were not blinded to allocation during experiments and outcome assessment.

Study system and populations. Guppies are a model system in evolutionary biology because they provide an opportunity to study rapid adaptive evolution in the wild. In lowland rivers, guppies occur in diverse fish communities where they experience high mortality from a number of fish predators. In small upstream tributaries, guppies occur in simpler communities, typically co-existing only with the killifish (Rivulus hartii), which poses little risk to adult guppies, resulting in a low predator-induced mortality rate. Past research has shown that numerous life history, behavioural, and morphological traits vary between these contrasting environments, and that these differences can evolve rapidly following experimental introductions. We sampled four populations of guppies within the Guanapo River drainage in the Northern Range Mountains of Trinidad, West Indies (Extended Data Fig. 1). The first population, hereafter referred to as HP, is a naturally occurring population subject to high predation in the lower Guanapo river drainage that contains a variety of predator species, including the common predator on guppies, the pike cichlid. The second population, hereafter referred to as LP, represents a native low-predation population from the same drainage and was sampled from the upstream Taylor tributary of the Guanapo river, where guppies co-exist with only R. hartii. R. hartii are gape-limited omnivores that prey primarily on juvenile guppies. The remaining two populations were experimentally established in two low-predation tributaries (the Lower Lalaja, and the Upper Lalaja) within the Guanapo drainage.

Introduction experiments. In March 2008, HP guppies were introduced into the Lower Lalaja (denoted as Intro1) and Upper Lalaja tributaries (denoted as Intro2) of the Guanapo drainage. The two introduction populations were established in 100-m reaches of these small, first-order tributaries. The upper limit of the introduction reach on the Lower Lalaja was bounded by a waterfall, which was artificially enhanced to prevent emigration and the establishment of populations above the streams receiving introductions. The upper limit of the Upper Lalaja introduction reach had a natural barrier. The lower limit of both introduction sites had natural barriers, which blocked immigration from downstream populations of guppies. The streams below these downstream barriers were also guppy-free before our introduction and were separated from the main river by additional barriers.

Each stream was stocked with 38 gravid females and 38 mature males. These fish 

had been collected as juveniles, reared to maturity in single sex groups, and then 
mated in groups of 4-5 males and 4-5 females per breeding group before intro- 
duction. To minimize the potential for founder effects and equalize genetic divers- 
ity in each stream, males and females from each breeding group were introduced 
into alternate streams. Doing so increased the effective population size of each 
population, because females retained the sperm from mating with one set of 
38 males, then were introduced and subsequently mated with a second set of 
38 males. As part of a separate experiment, the riparian forest canopy was experimentally
thinned in the Intro2 stream before the introductions (ref. 31), but the two
introduction streams were similar in all other respects. 
Laboratory breeding experiments. Laboratory populations used for the gene 
expression assays were second-generation laboratory fish that were originally 
derived from 30 adult females and 30 adult males collected from each of the HP, 
LP and two introduction populations (Intro 1, Intro 2) in March of 2009. This time 
period represented one year or 3-4 generations after the establishment of introduction
populations. Fish were kept in 1.5-l tanks (Aquatic Habitats) connected to
a custom-made recirculating system and maintained on a 12-h light cycle at
25 ± 1 °C (refs 32, 33). Fish were reared on standardized food levels adjusted weekly
for age and number of individuals per tank (morning, Tetramin tropical fish flakes, 
Spectrum Brands, Inc.; afternoon, brine shrimp (nauplii of Artemia spp.), Brine 
Shrimp Direct). The quantity of food offered daily approximated ad libitum and 
was comparable to the high level of food administered in other studies™. 

We reared all wild-caught guppies for two generations under common garden 
conditions using a breeding design that retains the genetic variation of the original 
population, prevents inbreeding, and minimizes maternal and other envir- 
onmental effects**. The first generation (G1) line in the laboratory was derived 
from wild-caught juveniles and reared to maturity in the lab. Wild-caught gravid 
females were housed individually until parturition and their offspring were used to 
create G1 family lines. Females that did not give birth within about 30-35 days of 
capture were randomly crossed with a wild-caught male; however, no two females 
were crossed with the same male. The G1 offspring from each brood were housed 
separately until sexed, and then separated into single-sex tanks. Juvenile females 
(28-56 days) can be identified by the presence of melanophores in a triangular 
patch that appears on their ventral abdomens, which is absent in males*. Sexing 
was accomplished by anaesthetizing guppies in buffered MS-222 (0.85 mg ml⁻¹;
ethyl 3-aminobenzoate methane sulfonic acid salt) (Sigma-Aldrich) and observing
the melanophores under a microscope. Males are considered to be sexually mature 
when the apical hood grows even with the tip of their gonopodium; females usually 
mature within + 1-2 days of males**. Mature males and females from each family 
line were then randomly chosen and crossed to other families to produce the 
second generation (G2). Each G2 family was the product of a unique cross, to 
minimize inbreeding and maximize the genetic variation within each population. 

Within 24 h of birth, G2 full-sibling broods were randomly assigned to two 1.5-l
tanks (2-10 full siblings per tank) that differed in exposure to chemical cues from a 
predator (reared with or without cues from a predator) using a split-brood design. 
Siblings reared with cues from predators were reared in recirculating units that 
housed a pike cichlid within the sump that supplied water to the tanks”**?*°, 
Chemical predation cues included both kairomones from the cichlid predator 
and alarm pheromones from the two guppies consumed daily by the cichlid. 
Guppies reared without cues from predators were housed in identical recirculating 
units without predators in the water supply. G2 juveniles were anaesthetized and 
sexed at 29 days (see above). From each population, we randomly selected
5-6 families to raise pairs of male siblings within each rearing treatment. 
RNA-sequencing. Focal animals were euthanized by immersion in ice water 
followed by rapid decapitation (IACUC approved protocol #12-3818A). Whole
brains were collected by cutting the head sagittally down the centre line and 
removing all brain tissue. Brains were then flash frozen in liquid nitrogen and 
stored at −80 °C until further processing. Tissue collection took <2 min per fish,
fast enough to minimize changes in gene expression due to handling. Whenever 
possible, we combined brains from two full-siblings in the same treatment group 
to ensure we could obtain sufficient RNA for sequencing, while minimizing vari- 
ation among pooled individuals. To minimize temporal and circadian variation, 
we performed all dissections within 15 min after lights-on in the morning (fish 
were all kept on a 12:12 h light-dark cycle). In addition, gene expression levels at 
lights-on minimized expression differences in response to recent experiences. Our 
data thus represent baseline transcription levels. The age of the fish (118-154 days) 
and the timing of sampling were randomly distributed across populations. Because 
all dissections occurred within 15 min, no more than 8 individuals (1-2 families 
distributed in both treatments) could be sampled per day, and the order in which 
populations were sampled was randomized. 

RNA was extracted from whole brain tissue using the Qiagen RNeasy lipid extraction
kit. A separate sequencing library was prepared for each pooled family, using
unique index sequences from the Illumina Tru-Seq RNA kit following manufac- 
turer’s instructions. Sequencing libraries were constructed and sequenced on three 
lanes of an Illumina HiSeq 2000 at the HudsonAlpha Genomic Services 
Laboratory (Huntsville, Alabama) in April 2012. In total, 32 samples were 
sequenced in 5 lanes (sample sizes that passed quality filters: n = 5 for HP reared 
with and without predators, n = 4 for Intro1 and Intro2 reared with and without
predators, n = 3 for LP reared with predators, and n = 2 for LP reared without
predators). We obtained 736,693,718 100-base pair (bp) reads that passed the machine
quality filter, with 17,517,493 to 28,265,561 reads per sample, and average
quality >35.6 for all samples. 

Sequencing reads were mapped to a high-quality brain-specific reference 
transcriptome for P. reticulata. We constructed the reference from a data set 
containing >450 million 100-bp paired-end reads, which were filtered for high- 
quality sequences and normalized in silico to compress the range in k-mer abund- 
ance. We used SeqMan NGEN 4.1.2 to perform the assembly, which contained 
41,347 contigs, N50 = 2,548, and recovered 63% of Tilapia (Oreochromis niloticus) 
Ensembl proteins (Release 70). Contigs from the assembly were annotated by 
blastx queries against SwissProt (database downloaded 6 October 2012), 
UniProt/Trembl (28 November 2012), and nr (11 December 2012). Default para- 
meters were used in the blastx queries, with e-value cut-off of 1 x 10°+. 
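The N50 quoted for the assembly can be illustrated with a minimal sketch. It assumes the usual definition (the contig length at which the running sum of lengths, sorted from longest to shortest, first reaches half of the total assembled bases); the function name and the toy contig lengths are hypothetical and are not taken from the assembly itself.

```python
def n50(contig_lengths):
    """Return the N50 of a collection of contig lengths (usual definition)."""
    ordered = sorted(contig_lengths, reverse=True)
    if not ordered:
        raise ValueError("contig_lengths must not be empty")
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first contig that brings the running sum to half the total is the N50.
        if running >= half_total:
            return length


# Made-up contig lengths, for illustration only.
toy_lengths = [100, 200, 300, 400, 500]
print(n50(toy_lengths))
```

A larger N50 indicates that more of the assembled sequence sits in long contigs, which is why it is reported alongside the contig count.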

Reads were mapped to the reference assembly using Bowtie 2 v2.0.0 on a server 
running Red Hat Enterprise Linux version 6.5. We used a seed size of 20 bp, with 
no mismatches allowed in the seed (run options: -D 15 -R 2 -N 0 -L 20 -i S,1,0.75).
We retained mappings with quality scores >30 (<0.001 probability that the read 
maps elsewhere in the reference) and kept only contigs represented by ≥1 count per
million reads in at least three samples. After removing low-abundance transcripts,
628,797,716 reads (85.3%) mapped to 37,493 unique contigs in the reference transcriptome.
We used the number of reads mapping to each of those contigs along
with TMM-normalized library sizes (ref. 35) to analyse differential expression.
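The abundance filter described above (at least 1 count per million reads in at least three samples) can be sketched as follows. This is an illustration under stated assumptions, not the pipeline used here: raw library sizes stand in for TMM-normalized ones, and the function names and counts are hypothetical.

```python
def counts_per_million(counts, library_sizes):
    """Convert per-sample read counts for one contig to counts per million (CPM)."""
    return [c * 1_000_000 / n for c, n in zip(counts, library_sizes)]


def passes_abundance_filter(counts, library_sizes, min_cpm=1.0, min_samples=3):
    """Keep a contig if it reaches min_cpm in at least min_samples samples."""
    cpm = counts_per_million(counts, library_sizes)
    return sum(value >= min_cpm for value in cpm) >= min_samples


# Made-up example: four samples of 20 million reads each.
library_sizes = [20_000_000] * 4
print(passes_abundance_filter([25, 30, 19, 40], library_sizes))  # three samples reach 1 CPM
print(passes_abundance_filter([25, 10, 5, 40], library_sizes))   # only two samples reach 1 CPM
```

Requiring the threshold in several samples, rather than on average, avoids retaining contigs supported by a single deeply sequenced library.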

Data analysis. Between-group analysis (BGA) was conducted (ref. 36) as implemented in
the R package made4 (ref. 37). BGA is a multivariate discriminant approach that is 
appropriate when the number of variables exceeds the number of cases; it is carried 
out by ordinating groups of samples and projecting the individual sample loca- 
tions on the resulting axes. We used principal components analysis (PCA) as the 
ordination method (Fig. 1). To quantify the rate of evolution along the axis 
separating the HP source population from the introduction populations, we calculated
evolutionary divergence in Haldanes (ref. 38). We assumed a time of 3.5 generations,
and used the difference in mean transcript abundance in the no-predator
treatment with a pooled standard deviation (see Extended Data Figs 2a, b).
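As a worked illustration of the rate calculation, the sketch below computes a Haldane as the difference in group means, scaled by the pooled standard deviation, divided by the number of generations. It assumes the standard pooled-variance formula and already log-transformed abundances; the function name and the expression values are hypothetical.

```python
import math
import statistics


def haldanes(source, derived, generations):
    """Rate of change in pooled standard deviations per generation."""
    n1, n2 = len(source), len(derived)
    # Pooled variance weights each sample variance by its degrees of freedom.
    pooled_var = (
        (n1 - 1) * statistics.variance(source)
        + (n2 - 1) * statistics.variance(derived)
    ) / (n1 + n2 - 2)
    mean_shift = statistics.mean(derived) - statistics.mean(source)
    return mean_shift / math.sqrt(pooled_var) / generations


# Made-up log-transformed abundances for one transcript.
hp_source = [2.0, 2.2, 2.4, 2.6, 2.8]
intro = [2.7, 2.9, 3.1, 3.3]
print(round(haldanes(hp_source, intro, generations=3.5), 3))
```

The sign of the result gives the direction of change relative to the source population; the histograms in Extended Data Fig. 2 report magnitudes.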

We used random permutation tests to evaluate differential expression across 
populations and treatment groups. Permuted data sets were generated by ran- 
domly reassigning entire RNA-seq samples among population and treatment 
categories to produce an empirical null distribution against which to test hypo- 
theses. This approach preserves any non-independence among transcripts that 
could bias inferences if the non-independence were not taken into account. We 
first computed transcript-specific test statistics from the actual data (see below) 
and compared that statistic to the distribution of the same statistic derived from 
250 permuted data sets. If the statistic for the real data fell within the extreme tails 
of the permuted values for that transcript, we called the transcript differentially 
expressed (DE). To determine if more transcripts were called DE than expected, 
we compared the number of DE transcripts in the real data set to the distribution of 
that number in the 250 permuted data sets. 
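The sample-level permutation scheme can be sketched as below. This is a simplified stand-in rather than the analysis itself: a difference in group means replaces the general linear model statistic, the two-sided 5% tail is an assumption, and all names and values are hypothetical. The key point it illustrates is that each permutation reassigns whole samples, so the same shuffled labels are applied to every transcript.

```python
import random
import statistics


def mean_difference(values, labels):
    """Test statistic: mean of group A minus mean of group B."""
    a = [v for v, g in zip(values, labels) if g == "A"]
    b = [v for v, g in zip(values, labels) if g == "B"]
    return statistics.mean(a) - statistics.mean(b)


def permutation_calls(expression, labels, n_perm=250, tail=0.05, seed=0):
    """Call a transcript DE if its statistic lies in the extreme tail of the null."""
    rng = random.Random(seed)
    # One shuffled label vector per permutation, shared by all transcripts,
    # so that correlations among transcripts are preserved in the null.
    permuted_labels = []
    for _ in range(n_perm):
        shuffled = labels[:]
        rng.shuffle(shuffled)
        permuted_labels.append(shuffled)
    calls = {}
    for name, values in expression.items():
        observed = mean_difference(values, labels)
        null = [mean_difference(values, p) for p in permuted_labels]
        extreme = sum(abs(x) >= abs(observed) for x in null)
        calls[name] = extreme / n_perm <= tail
    return calls


# Made-up expression values for two transcripts across ten samples.
labels = ["A"] * 5 + ["B"] * 5
expression = {
    "shifted": [5.0, 5.1, 4.9, 5.2, 5.0, 1.0, 1.1, 0.9, 1.2, 1.0],
    "flat": [1.0, 2.0, 3.0, 4.0, 5.0, 1.0, 2.0, 3.0, 4.0, 5.0],
}
print(permutation_calls(expression, labels))
```

Counting the number of DE calls in each permuted data set in the same way would give the null distribution for the total number of DE transcripts.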

To determine if transcripts were significantly evolved in each introduction 
population we restricted the analysis to samples collected from fish reared without 
predator cues. For both the actual and the permuted data sets, a general linear 
model was applied separately to each transcript, with the normalized transformed 
number of reads as the dependent variable and population (HP and Intro1 or HP
and Intro2, depending on the analysis) as a fixed effect. We then used general
linear statistical models to assess divergence in the two introduction populations
and the natural LP population (that is, HP versus Intro1 and HP versus Intro2 and
HP versus LP) for each transcript. If the test statistic for each of the three contrasts 
fell in the extreme 5% of the distribution of the permutation test statistics, and the 
contrasts all had the same sign, we called the transcript concordantly differentially 
expressed (CDE). To calculate the number of transcripts expected to be called CDE 
in the two introduction populations under random expectations, we conducted 
this same analysis in each of the 250 permuted data sets, and calculated the number 
of transcripts meeting the same criteria. This permutation analysis accounts for 
any spurious associations that might result from comparing both introduction 
populations to the same ancestral HP population (ref. 39).
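The CDE criterion (all three contrasts in the extreme 5% of their permutation distributions, with the same sign) can be expressed as a short sketch. The two-sided tail definition, the function names and the null values below are assumptions made for illustration only.

```python
def in_extreme_tail(observed, null_stats, tail=0.05):
    """True if at most `tail` of the null statistics are as extreme as observed."""
    hits = sum(abs(x) >= abs(observed) for x in null_stats)
    return hits / len(null_stats) <= tail


def is_cde(observed_by_contrast, null_by_contrast, tail=0.05):
    """Concordantly differentially expressed: every contrast extreme, all same sign."""
    values = list(observed_by_contrast.values())
    same_sign = all(v > 0 for v in values) or all(v < 0 for v in values)
    all_extreme = all(
        in_extreme_tail(observed_by_contrast[c], null_by_contrast[c], tail)
        for c in observed_by_contrast
    )
    return same_sign and all_extreme


# Made-up null distribution of 250 permuted statistics, reused for each contrast.
toy_null = [(-1) ** i * (i % 50) / 50 for i in range(250)]
nulls = {c: toy_null for c in ("HP_vs_Intro1", "HP_vs_Intro2", "HP_vs_LP")}

concordant = {"HP_vs_Intro1": 1.8, "HP_vs_Intro2": 2.1, "HP_vs_LP": 2.5}
discordant = {"HP_vs_Intro1": 1.8, "HP_vs_Intro2": -2.1, "HP_vs_LP": 2.5}
print(is_cde(concordant, nulls), is_cde(discordant, nulls))
```

Applying the same function to each permuted data set gives the number of transcripts expected to meet the criterion by chance.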

To test if the divergence in gene expression is greater than would be expected by 
neutral processes, we calculated PST (a measure of phenotypic divergence between
populations) from phenotypic variance components as in ref. 40, assuming
h² = 0.5 (where h² = heritability of a trait) for transcript expression level. This h²
estimate is substantially higher than the average estimate from a recent analysis in
sticklebacks, making our comparison of PST with published FST estimates (FST is
a measure of genetic divergence between populations) conservative with respect to
the hypothesis that divergence is greater than expected under genetic drift. 
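The role of the heritability assumption can be made concrete with a small sketch. It assumes the commonly used form PST = σ²B / (σ²B + 2h²σ²W), where σ²B and σ²W are the between- and within-population phenotypic variance components; the exact expression in ref. 40 should be checked before reuse, and the variance values below are made up.

```python
def p_st(var_between, var_within, h2=0.5):
    """P_ST from between- and within-population variance components."""
    return var_between / (var_between + 2 * h2 * var_within)


# Made-up variance components for one transcript.
print(p_st(0.2, 0.6))            # h2 = 0.5, the value assumed in the text
print(p_st(0.2, 0.6, h2=0.25))   # a lower heritability gives a larger P_ST
```

Because PST decreases as the assumed h² increases, choosing a comparatively high h² biases the estimate downwards, which is the sense in which the comparison with FST is conservative.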

We assessed the association between evolutionary divergence and ancestral 
plasticity in gene expression by conducting likelihood ratio tests of independence 
and comparing the resulting χ² value to the distribution of χ² values produced by
conducting the same test on the 250 permuted data sets. Similarly, for the CDE 
transcripts, we calculated the Spearman rank correlation between evolution (mean 
change in expression level between HP and introduction populations in the 
no-predator-cue environment) and plasticity (mean change in expression in the
HP ancestral population reared in the two predator-exposure environments), and
compared that value to the distribution of values obtained from 1,000 random 
permutations of the population and treatment group labels. This permutation 
analysis accounts for any spurious correlation that can result because the calcula- 
tions for evolutionary divergence and plasticity share a common term (mean 
expression level in the HP source population reared without predator cues) (ref. 39).
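A likelihood ratio test of independence on a 2 × 2 table (direction of plasticity against direction of divergence) can be sketched as below, together with the comparison against permuted values. The statistic is the standard G statistic; the table counts and the null values are hypothetical and serve only to show the mechanics.

```python
import math


def g_statistic(table):
    """Likelihood ratio (G) statistic for independence in a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    g = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            if observed > 0:  # empty cells contribute nothing to the sum
                expected = row_totals[i] * col_totals[j] / total
                g += observed * math.log(observed / expected)
    return 2 * g


def fraction_of_null_at_or_above(observed, null_values):
    """Empirical tail probability of the observed statistic in the permuted set."""
    return sum(v >= observed for v in null_values) / len(null_values)


# Made-up table: rows = plasticity up/down, columns = divergence up/down.
toy_table = [[10, 40], [45, 5]]
toy_null = [0.5 * k for k in range(250)]  # placeholder for 250 permuted statistics
g_obs = g_statistic(toy_table)
print(round(g_obs, 2), fraction_of_null_at_or_above(g_obs, toy_null))
```

Comparing against permuted statistics rather than the theoretical χ² distribution accounts for the shared term noted in the text.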

For CDE transcripts, we quantified plasticity in the source population and in the 
introduced populations as the difference in the mean expression values (normal- 
ized log-transformed number of reads mapping to a given transcript) for each 
transcript in the two predator-cue treatment groups within each population. We 
then calculated the change in these plasticity values between the source and intro- 
duction populations and used a nonparametric sign test to determine if that 
change was significant. We evaluated the association between ancestral and des- 
cendant plasticity in the CDE transcripts using a Spearman’s rank correlation, and 
determined significance of that correlation using a random permutation test. 
Starting with the mean expression levels for each transcript within each popu- 
lation/treatment group, we randomly permuted the population/treatment labels 
1,000 times, recalculated ancestral and derived plasticity values for each transcript 
in each permutation, and calculated Spearman’s rank correlation of the permuted 
values. All statistical analyses were implemented in SAS 9.4 (SAS 2011) running in 
a Linux environment. 
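The correlation between ancestral and descendant plasticity, and its permutation test, can be sketched in a few lines. This is a simplified illustration with only four group labels and hypothetical values, written in Python rather than the SAS used for the analyses; the two-sided comparison is an assumption.

```python
import random

GROUPS = ["anc_pred", "anc_nopred", "der_pred", "der_nopred"]


def ranks(values):
    """Ranks starting at 1, with tied values given their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            result[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return result


def pearson(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return 0.0 if sxx == 0 or syy == 0 else sxy / (sxx * syy) ** 0.5


def spearman(x, y):
    return pearson(ranks(x), ranks(y))


def plasticity_correlation(means):
    """Spearman correlation of ancestral versus derived plasticity across transcripts."""
    anc = [p - n for p, n in zip(means["anc_pred"], means["anc_nopred"])]
    der = [p - n for p, n in zip(means["der_pred"], means["der_nopred"])]
    return spearman(anc, der)


def permutation_p_value(means, n_perm=1000, seed=0):
    """Shuffle the group labels, recompute plasticities, and compare correlations."""
    rng = random.Random(seed)
    observed = plasticity_correlation(means)
    columns = [means[g] for g in GROUPS]
    hits = 0
    for _ in range(n_perm):
        shuffled = columns[:]
        rng.shuffle(shuffled)
        if abs(plasticity_correlation(dict(zip(GROUPS, shuffled)))) >= abs(observed):
            hits += 1
    return observed, hits / n_perm


# Made-up mean expression levels for five transcripts in each group.
toy_means = {
    "anc_pred": [2.0, 3.1, 1.5, 4.2, 2.8],
    "anc_nopred": [2.5, 2.9, 1.9, 4.0, 3.5],
    "der_pred": [2.6, 3.0, 2.1, 4.1, 3.3],
    "der_nopred": [2.9, 2.8, 2.4, 3.8, 3.9],
}
print(permutation_p_value(toy_means))
```

Permuting labels before recomputing the plasticity values, instead of permuting the plasticity values themselves, keeps the dependence introduced by shared terms in the null distribution.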


31. Kohler, T. J., Heatherly, T. N. II, El-Sabaawi, R. W., Zandona, E., Marshall, M. C.,
Flecker, A. S., Pringle, C. M., Reznick, D. N. & Thomas, S. A. Flow, nutrients, and light 
availability influence Neotropical epilithon biomass and stoichiometry. Freshwater 
Sci. 31, 1019-1034 (2012). 

32. Torres-Dowdall, J., Handelsman, C. A., Reznick, D. N. & Ghalambor, C. K. Local 
adaptation and the evolution of phenotypic plasticity in Trinidadian guppies 
(Poecilia reticulata). Evolution 66, 3432-3443 (2012). 

33. Ruell, E. W. et al. Fear, food and sexual ornamentation: plasticity of colour
development in Trinidadian guppies. Proc. R. Soc. Lond. B 280, 20122019 
(2013). 

34. Reznick, D. The impact of predation on life history evolution in Trinidadian 
guppies: genetic basis of observed life history patterns. Evolution 36, 1236-1250 
(1982). 

35. Robinson, M. D. & Oshlack, A. A scaling normalization method for differential 
expression analysis of RNA-seq data. Genome Biol. 11, R25 (2010). 

36. Culhane, A. C., Perrière, G., Considine, E. C., Cotter, T. G. & Higgins, D. G. Between-group
analysis of microarray data. Bioinformatics 18, 1600-1608 (2002).

37. Culhane, A. C., Thioulouse, J., Perrière, G. & Higgins, D. G. MADE4: an R package for
multivariate analysis of gene expression data. Bioinformatics 21, 2789-2790 
(2005). 

38. Gingerich, P. D. Rates of evolution on the time scale of the evolutionary process. 
Genetica 112-113, 127-144 (2001). 

39. Jackson, D. A. & Somers, K. M. The spectre of “spurious” correlations. Oecologia 86,
147-151 (1991). 

40. Leinonen, T., Cano, J. M., Mäkinen, H. & Merilä, J. Contrasting patterns of body
shape and neutral genetic divergence in marine and lake populations of 
threespine sticklebacks. J. Evol. Biol. 19, 1803-1812 (2006). 

41. Fitzpatrick, S. W., Gerberich, J. C., Kronenberger, J. A., Angeloni, L. M. & Funk, W. C.
Locally adapted traits maintained in the face of high gene flow. Ecol. Lett. 18, 
37-47 (2015). 


Extended Data Figure 1 | Map of Trinidad where the experimental transplants took place. Guppies were moved from a high-predation (HP) locality where they coexist with cichlid predators and introduced into two streams that lacked cichlids and guppies, Intro1 (left photograph) and Intro2 (right photograph). A naturally occurring guppy population without cichlids, low-predation (LP), was sampled to provide a low-predation reference.
[Map labels: Intro 1, Intro 2, Low Predation, High Predation; scale bar, 1 km.]


Extended Data Figure 2 | Frequency histogram of Haldanes for the top 500 transcripts loading on PC2, the axis representing rapid evolutionary divergence between the source and introduction populations. a, Intro1 (median Haldane = 0.256, range = 0.07-0.74). b, Intro2 (median = 0.226, range = 0.10-1.68).
[Histogram axes: x, rates of evolution in Haldanes between the source population and Intro1 (a) or Intro2 (b); y, count.]


Extended Data Figure 3 | Ancestral plasticity and evolution in patterns of gene expression for a representative gene: uridine phosphorylase 2 (upp2). Shown is the plastic response of the high-predation source population and the evolved responses in the two experimental introduction populations (Intro1 and Intro2). In this case the plastic response results in a decrease in expression, whereas the evolved response in the introduction populations is to increase expression, thus illustrating non-adaptive plasticity.
[Plot: title, Uridine Phosphorylase 2 (upp2); x axis, treatment (predator, no predator); series, HP Source, Introduction 1, Introduction 2.]


Extended Data Figure 4 | Scatter plot of ancestral plasticity (change in
transcript abundance to the absence of cichlid predator cues) and population
divergence. Shown are the 565 transcripts that exhibited significant
differences in expression between the predator and non-predator rearing
treatments in the HP source population. We found a similar pattern as was
found for the CDE transcripts (Fig. 2): 75% (424 out of 565) of the significantly
plastic genes exhibited population divergence in the introduction populations
in the opposite direction of plasticity (χ² = 284.2, d.f. = 1). This result falls in
the upper percentile of the 250 permuted χ² values; median permuted
values = 19.1, interquartile range = 6.7-50.8. Only eight transcripts were
common to the data sets that were significantly evolved (CDE; Figs 2, 3) and
significantly plastic, suggesting that short-term plastic responses and longer-
term evolutionary responses involve largely different sets of genes.
[Plot axes: x, ancestral plasticity in gene expression (log-transformed count per million reads); y, divergence in gene expression (log-transformed count per million reads).]


Extended Data Table 1 | Comparison of gene expression divergence (PST) with divergence of putatively neutral microsatellite loci (FST)

Measure   Intro1‡        Intro2§        Data set
PST*      0.32 (0.21)    0.27 (0.21)    Only CDE transcripts
PST*      0.05 (0.11)    0.05 (0.12)    Only non-CDE transcripts
PST*      0.05 (0.11)    0.05 (0.10)    All transcripts
FST†      0.01           N/A            10 microsatellite loci

*Quantitative divergence estimated by PST, a phenotypic proxy for quantitative genetic divergence QST (ref. 40), calculated under the conservative assumption that half the within-population variation was heritable. Numbers in parentheses are standard deviations.
†Neutral divergence estimated from 10 microsatellite loci (ref. 41).
‡Divergence between the ancestral HP site (Guanapo) and the Intro1 site (Lower Lalaja).
§Divergence between the ancestral HP site (Guanapo) and the Intro2 site (Upper Lalaja).




doi:10.1038/nature14907 


A new cyanogenic metabolite in Arabidopsis 
required for inducible pathogen defence 


Jakub Rajniak1, Brenden Barco2, Nicole K. Clay2 & Elizabeth S. Sattely1


Thousands of putative biosynthetic genes in Arabidopsis thaliana 
have no known function, which suggests that there are numerous 
molecules contributing to plant fitness that have not yet been discov- 
ered'”. Prime among these uncharacterized genes are cytochromes 
P450 upregulated in response to pathogens**. Here we start with a 
single pathogen-induced P450 (ref. 5), CYP82C2, and use a combina- 
tion of untargeted metabolomics and coexpression analysis to 
uncover the complete biosynthetic pathway to 4-hydroxyindole-3- 
carbonyl nitrile (4-OH-ICN), a previously unknown Arabidopsis 
metabolite. This metabolite harbours cyanogenic functionality that 
is unprecedented in plants and exceedingly rare in nature’; further- 
more, the aryl cyanohydrin intermediate in the 4-OH-ICN pathway 
reveals a latent capacity for cyanogenic glucoside biosynthesis*” in 
Arabidopsis. By expressing 4-OH-ICN biosynthetic enzymes in 
Saccharomyces cerevisiae and Nicotiana benthamiana, we reconstit- 
ute the complete pathway in vitro and in vivo and validate the func- 
tions of its enzymes. Arabidopsis 4-OH-ICN pathway mutants show 
increased susceptibility to the bacterial pathogen Pseudomonas 
syringae, consistent with a role in inducible pathogen defence. 
Arabidopsis has been the pre-eminent model system’°" for studying 
the role of small molecules in plant innate immunity”; our results 
uncover a new branch of indole metabolism distinct from the canon- 
ical camalexin pathway, and support a role for this pathway in the 
Arabidopsis defence response’’. These results establish a more com- 
plete framework for understanding how the model plant Arabidopsis 
uses small molecules in pathogen defence. 

To identify cytochromes P450 potentially involved in the biosyn- 
thesis of novel defence-associated small molecules, we obtained raw 
data sets for all transcriptomics experiments dealing with biotic stress 
in A. thaliana from the NASCArrays database. We examined CYP 
genes present in the probeset and selected a candidate, CYP82C2, that 
is highly expressed under a variety of pathogen treatment conditions, 
but whose native function in Arabidopsis is unknown (Fig. 1a). 

To identify small molecules whose levels change in a CYP82C2- 
dependent manner, we performed comparative metabolomics™ with 
a homozygous transfer-DNA insertion line of CYP82C2. We used the 
bacterial pathogen P. syringae pv. tomato DC3000 harbouring the 
avrRpm1 avirulence gene (Psta) as an elicitor since CYP82C2 expression
is strongly upregulated 24 h after inoculation with this strain
(Fig. 1a). We analysed tissue methanolic extracts of 11-day-old seed- 
lings grown hydroponically in the presence of Psta by liquid chromato- 
graphy—mass spectrometry (LC-MS), and computationally compared 
mutant and wild-type (WT) Col-0 metabolomes. From this analysis, 
we identified 11 compound mass signals that reproducibly and signifi- 
cantly differ between WT and cyp82C2 (Fig. 1b); these mass ions are 
induced after pathogen elicitation and are not bacterially derived 
(Extended Data Fig. 1a). 

We next sought to obtain clues about the structure of these com- 
pounds from their tandem mass spectra (MS/MS). MS/MS analysis 
revealed that the 11 compounds could be divided into two classes 
(A and B in Fig. 1b), assigned as indole-3-carboxaldehyde (IAL) derivatives
with (B) and without (A) hydroxylated indole systems.
Moreover, the fact that the cyp82C2 mutant lacked all the hydroxylated 
derivatives but accumulated excess amounts of their non-hydroxylated 
counterparts suggested that CYP82C2 acts as an indolic hydroxylase. 
However, except for compound A1 (Fig. 2b), which was confirmed to 
be indole-3-carboxylic acid methyl ester, the structures of these com- 
pounds remained elusive. 

To facilitate structural analysis, we investigated whether any of these 
compounds were exuded into the medium in the cyp82C2 mutant 
seedling experiments (Fig. 1d). Filtered spent medium was loaded onto 
a C18 silica gel cartridge, and non-polar metabolites were eluted with 
acetonitrile and analysed by LC-MS. Surprisingly, the profile of spent 
medium extracted in this manner was notably different from that 
of tissue methanolic extracts: while small amounts of A2-A7 were 
present, no A1 could be detected; instead, a new ultraviolet-active
compound with m/z = 171.0553 [M + H]+ dominated the LC-MS
trace (Fig. 1d). NMR analysis of this compound followed by comparison
with a synthetic standard established its identity as the novel
metabolite indole-3-carbonyl nitrile (ICN) (Fig. 1c and Extended
Data Fig. 2). 
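The observed m/z can be checked against the structure with a short calculation. The sketch below assumes the molecular formula C10H6N2O for ICN (inferred here from the compound name, not stated in the text) and uses standard monoisotopic masses; the function name is hypothetical.

```python
# Monoisotopic masses (u) of the most abundant isotopes.
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503, "N": 14.0030740, "O": 15.9949146}
PROTON = 1.00727646  # mass of H+ (hydrogen atom minus one electron)


def mz_protonated(formula_counts):
    """m/z of the singly protonated ion [M + H]+ for a neutral formula."""
    neutral = sum(MONOISOTOPIC[element] * n for element, n in formula_counts.items())
    return neutral + PROTON


# Assumed formula for indole-3-carbonyl nitrile: C10H6N2O.
icn = {"C": 10, "H": 6, "N": 2, "O": 1}
print(round(mz_protonated(icn), 4))
```

Under that assumption, the calculated value agrees with the measured 171.0553 to within 0.0005, consistent with the assignment confirmed by NMR and the synthetic standard.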

Chemically, the most striking feature of ICN is the presence of a 
highly reactive α-ketonitrile moiety that, to our knowledge, has not
been found in any plant natural product; however, benzoyl cyanide 
has been previously identified in the secretions of millipedes®’. The 
α-ketonitrile is susceptible to nucleophilic attack, resulting in the displacement
of cyanide ion: in alkaline aqueous solution, ICN degrades
to indole-3-carboxylic acid (ICA) (an alternative route to ICA in 
Arabidopsis has been reported’); in methanol, ICA methyl ester 
(A1) is formed instead, explaining the presence of A1 and the absence 
of ICN in methanolic extracts (Fig. 1c). Modifying the tissue extraction 
procedure by using an acidified 1:1 acetonitrile/water mixture enabled 
direct detection of ICN by LC-MS; additionally, when deuterated 
methanol was used, only the deuterated form of A1 was observed
(Extended Data Fig. 1b-e). On the basis of its molecular formula and 
the synthesis of an authentic standard, A6 was shown to be a serine- 
ICN addition product (see Fig. 2b). However, in the presence of 
cysteine and structurally related compounds, ICN can undergo a spon- 
taneous cycloaddition, resulting in the formation of a thiazoline ring 
and the net loss of ammonia. This last observation allowed us to deter- 
mine the structures of and synthesize standards for compounds A2- 
A5, which are the cycloaddition products of ICN and cysteine (A4) or 
Cys-Gly dipeptide (A2) and their thiazole analogues (A5 and A3, 
respectively; see Fig. 2b, Extended Data Fig. 3, and Supplementary 
Table 1). 

The absence of the hydroxylated analogues B1-B6 in the cyp82C2 
insertion line pointed to ICN as the likely substrate for this enzyme. 
Incubation of ICN with yeast-expressed CYP82C2 yielded only a trace 
amount of hydroxylated ICN, but a significant amount of 4-hydroxy- 
ICA (4-OH-ICA) (structure shown in Fig. 3), as confirmed by NMR 
spectroscopy and comparison with a synthetic standard (Extended 
Data Fig. 4a-d). Since CYP82C2 shows no activity on ICA, we deduced
that CYP82C2 converts ICN to 4-OH-ICN, competing with hydrolysis
of ICN to ICA (Extended Data Fig. 4e, f). Further experiments with
chemically synthesized 4-OH-ICN showed that its half-life is approximately
3 min in aqueous solution at pH = 7.5 (Supplementary Table
2), rendering direct isolation of the 4-OH-ICN product infeasible.
Chemical synthesis of 4-OH-ICN further enabled the synthesis of
the 4-hydroxy derivatives of A1-A6, confirming that these correspond
to compounds B1-B6 seen in WT tissue extracts (Extended Data Fig. 3
and Supplementary Table 1). Therefore, all the metabolites identified
in our initial metabolomics experiment with cyp82C2 are ultimately
derived from ICN, whether as artefacts of the extraction (A1 and B1),
or as in vivo addition products (A2-A7, B2-B6).

Figure 1 | Transcriptomic and metabolomic analyses implicate CYP82C2 in the biosynthesis of novel pathogen defence-related secondary metabolites. a, Heat map of relative gene expression levels for cytochrome P450 genes in Arabidopsis under various pathogen stress conditions. The enlarged map shows the top 10 P450 genes after sorting by mean expression level over all conditions. Cytochromes P450 in grey have previously been biochemically characterized. b, Levels of the most significantly differing metabolites identified in seedling comparative metabolomics experiments with cyp82C2. Data represent the mean ± s.d. of six biological replicates. c, ICA methyl ester (A1) and 4-OH-ICA methyl ester (B1) are methanolic degradation products of ICN and 4-OH-ICN. d, High-performance liquid chromatography traces of growth medium for WT and cyp82C2 seedlings, showing Psta-dependent accumulation of ICN and 4-OH-ICN.

1Department of Chemical Engineering, Stanford University, Stanford, California 94305, USA. 2Department of Molecular, Cellular, and Developmental Biology, Yale University, New Haven, Connecticut 06511, USA.

376 | NATURE | VOL 525 | 17 SEPTEMBER 2015

We next investigated the biosynthesis of ICN, using the CYP82C2 
gene as bait for coexpression analysis. For our pathogen data set, the 
CYP79B2 gene, whose encoded enzyme converts tryptophan (Trp) into 
indole-3-acetaldoxime (IAOx)’*, has the second highest correlation 
(Pearson’s r) with CYP82C2 among all genes profiled (Supplementary 
Table 3). We performed a metabolomic analysis of the cyp79B2 cyp79B3
double knockout line, which is deficient in IAOx production. No ICN-derived
metabolites are produced in this mutant (Fig. 2a), indicating that
ICN is derived from IAOx. 
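The bait-based coexpression ranking can be illustrated with a minimal sketch: Pearson's r is computed between the bait profile and every other gene across conditions, and genes are sorted by r. The gene names and expression values below are hypothetical placeholders, not data from the pathogen data set.

```python
def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length profiles."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def rank_by_coexpression(profiles, bait):
    """Sort all genes other than the bait by their correlation with the bait."""
    bait_profile = profiles[bait]
    scores = {
        gene: pearson_r(bait_profile, profile)
        for gene, profile in profiles.items()
        if gene != bait
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Made-up log fold changes across four pathogen treatments.
toy_profiles = {
    "bait": [0.1, 1.2, 2.5, 3.1],
    "gene_a": [0.2, 1.0, 2.4, 3.3],
    "gene_b": [3.0, 2.0, 1.1, 0.2],
    "gene_c": [1.0, 1.1, 0.9, 1.0],
}
for gene, r in rank_by_coexpression(toy_profiles, "bait"):
    print(gene, round(r, 2))
```

Genes at the top of such a list are candidates for acting in the same pathway as the bait, which is the logic used to move from CYP82C2 to CYP79B2.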

In searching for the enzyme(s) responsible for further conversion of 
IAOx to ICN, we postulated a biosynthetic route paralleling that of the 
cyanogenic glycoside dhurrin®: a CYP79-catalysed formation of an 
aldoxime, followed by a CYP71-catalysed formation of a cyanohydrin 
intermediate. In the dhurrin pathway, the cyanohydrin is glucosylated 
to yield the final product, whereas in ICN biosynthesis, a final dehydrogenation
is required to produce an α-ketonitrile (Fig. 2c).

Correlation analysis implicated CYP71A12, a P450 linked to cama- 
lexin biosynthesis"’, as the most likely candidate gene for the cyano- 
hydrin formation step (Supplementary Table 3). Profiling of the 
cyp71A12 transfer-DNA insertion line, as well as transfer-DNA inser- 
tion lines of its two closest Arabidopsis homologues, CYP71A13 and 
CYP71A18, demonstrated that the CYP71A12 gene is in fact probably 
responsible: all ICN derivatives with the exception of A6 are at ~10% 
of WT levels in the cyp71A12 mutant, but unaffected in the cyp71A13 


Figure 2 | Targeted metabolic profiling of candidate transfer-DNA 
insertion lines helps uncover the entire ICN biosynthetic pathway. 
a, Heat map of mean ICN-derived metabolite levels relative to WT in 
Psta-elicited transfer-DNA insertion lines. Mutants in bold have 
significantly decreased levels of ICN derivatives. Note that A6 levels are 
not affected to the same extent as levels of other metabolites in any line 
except for cyp79B2/B3, hinting at an alternative biosynthetic route from 
IAOx for this metabolite. b, Structures of all ICN derivatives, confirmed 
by comparison with synthetic standards (see Extended Data Fig. 3 and 
Supplementary Table 1). c, Proposed biosynthetic pathway from Trp to 
4-OH-ICN and downstream metabolites. 

[Figure 2 panel labels: a, heat map colour scale log2(fold change vs WT), 
-2 to 2. c, L-Trp, IAOx, indole cyanohydrin and 4-OH-ICN, with enzymes 
CYP79B2, CYP71A12 (At2g30750), FOX1 (At1g26380), CYP82C2 (At4g30530) and 
GGP1; the arrow legend distinguishes function based on metabolomics data 
from in vitro biochemical characterization.] 
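The heat map in Fig. 2a reports metabolite levels as log2(fold change vs WT). As a rough illustration of how the ratios quoted in the text map onto that scale, the following minimal sketch converts a mutant/WT ratio into a log2 value. The function name and the abundance numbers are hypothetical and chosen only to mirror the "~10% of WT" and "three- to fivefold reduction" figures; they are not data from the paper.

```python
import math


def log2_fold_change(mutant_mean: float, wt_mean: float) -> float:
    """Return log2(mutant / WT); both mean levels must be positive."""
    if mutant_mean <= 0 or wt_mean <= 0:
        raise ValueError("metabolite levels must be positive")
    return math.log2(mutant_mean / wt_mean)


# Hypothetical abundances, chosen only to reproduce ratios quoted in the text.
wt_level = 100.0
examples = {
    "10% of WT": 10.0,
    "threefold reduction": wt_level / 3,
    "fivefold reduction": wt_level / 5,
}

for label, level in examples.items():
    print(f"{label}: log2 fold change = {log2_fold_change(level, wt_level):.2f}")
```

On this scale a level of about 10% of WT lies below -3, that is, beyond the -2 end of the colour bar, whereas a three- to fivefold reduction falls between roughly -1.6 and -2.3.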


17 SEPTEMBER 2015 | VOL 525 | NATURE | 377 




Figure 3 | In vitro reconstitution of 4-OH-ICN biosynthesis from IAOx. 
Combined extracted ion chromatograms (EICs) for IAOx substrate and 
reaction products for various subsets of enzymes in the 4-OH-ICN pathway; 
4-OH-ICN could not be detected directly and its hydrolysis product 4-OH-ICA 
is shown instead. 

[Figure 3 traces: equimolar standards; empty vector control; CYP71A12; 
CYP71A12 + FOX1; CYP71A12 + FOX1 + CYP82C2. x axis, retention time (min).] 


and cyp71A18 mutants (Fig. 2a). Levels of camalexin and other indolic 
metabolites were only slightly changed in whole-seedling tissue 
extracts of the cyp71A12 mutant (Extended Data Fig. 5c). 

Further correlation analysis using CYP71A12 as bait revealed a clus- 
ter of five tandemly arrayed homologous genes, At1g26380-At1g26420, 
that are highly coexpressed with CYP71A12 (Supplementary Table 3). 
At1g26380 encodes a flavin-dependent oxidoreductase known as 
FOX1 (ref. 19). We profiled the corresponding homozygous transfer- 
DNA insertion lines for these genes and found a three- to fivefold 
reduction in levels of ICN metabolites in the fox1 mutant, with no 
significant changes observed for the other mutants (Fig. 2a). 
Additionally, we observed a build-up of IAL, the expected hydrolysis 


product of the indole-3-cyanohydrin intermediate (Extended Data 
Fig. 5d). More strikingly, the fox1 mutant accumulates new mass 
signals corresponding to indole cyanogenic glycosides (ICGs), not 
previously observed in plants (Extended Data Fig. 6a-e, structures 
shown in Fig. 4d). Cyanogenic glycoside compounds are widely dis- 
tributed in the plant kingdom, but have not yet been detected in 
Arabidopsis°. Disruption of the ICN pathway at the FOX1-catalysed 
step therefore leads to capture of some portion of the cyanohydrin 
intermediate by non-specific glycosyltransferases, exactly paralleling 
dhurrin synthesis (ref. 8). 

We sought to confirm the proposed biochemical transformations 
(Fig. 2c) by reconstituting the complete pathway in vitro. A combination 
of yeast microsomal CYP71A12 and CYP82C2 and N. benthamiana- 
expressed FOX1 was sufficient to catalyse the conversion of IAOx to 
ICN, as illustrated in Fig. 3; the production of 4-OH-ICN is inferred 
from the accumulation of 4-OH-ICA. We also reconstituted the bio- 
synthesis of 4-OH-ICN in the heterologous host N. benthamiana, 
using transient expression of the four pathway genes necessary for 
production of 4-OH-ICN from Trp via Agrobacterium-mediated 
transient transformation”. We observed significant accumulation of 
B1 (from methanol extraction of 4-OH-ICN) only when all pathway 
genes were present; however, we also noted background levels of ICA 
and IAL when only early pathway genes were expressed (Extended 
Data Fig. 7). Notably, when we expressed CYP79B2 and CYP71A12 
but not FOX1, we again observed the accumulation of ICG mass 
signals (Extended Data Fig. 6f). 

The Trp-derived metabolites camalexin and 4-methoxy indol-3- 
ylmethylglucosinolate (4-methoxyglucobrassicin) have been shown 
to play a key role in Arabidopsis immunity (Fig. 4d)'°''"***. To evalu- 
ate whether 4-OH-ICN pathway products also contribute to 
Arabidopsis disease resistance, we challenged 4-OH-ICN biosynthetic 
mutants with a diverse panel of pathogens. Using surface inoculation 
to mimic the natural infection process, we found that, compared with 


[Figure 4 panels: a, b, bar charts of Pst growth by genotype (cyp79B2 
cyp79B3, cyp71A12, cyp82C2, cyp71A13, pad3-1, fls2), with water and flg22 
treatments in b. c, Pst growth in WT + solvent control, WT + ICN and 
WT + 4-OH-ICN. d, Scheme of Trp-derived pathways from L-Trp via IAOx to 
glucosinolates (CYP83B1), camalexin (CYP71A13) and 4-OH-ICN (CYP71A12, 
FOX1, CYP82C2), noting that cyanogenic glycosides accumulate in 
at1g26380 (fox1).] 


Figure 4 | Camalexin and CYP82C2-synthesized 4-OH-ICN contribute 
non-redundantly to disease resistance against the virulent bacterial 
pathogen P. syringae. a, Growth analysis of the virulent P. syringae pv. tomato 
DC3000 (Pst) in surface-inoculated adult leaves. Data represent the 
mean ± s.e.m. of four biological replicates. Data points labelled with different 
letters are significantly different (P < 0.05, two-tailed t test); data points labelled 
with the same letter are not significantly different. WT, Col-0 ecotype; c.f.u., 
colony-forming units. b, Growth analysis of Pst in 10-day-old seedlings pre- 
treated with water or 1 µM bacterial MAMP flg22 for 6 h. Data represent the 
median ± s.e.m. of four biological replicates of 10-15 seedlings each. Different 
letters denote statistically significant differences (P < 0.05, two-tailed t test). 
c, Growth analysis of Pst in WT adult leaves pre-immunized with 1 µM 
flg22 and 100 µM ICN, 4-OH-ICN, camalexin or solvent control 
(dimethylsulfoxide (DMSO)) for 24 h before infiltration with Pst. Data 
represent the median ± s.e.m. of three biological replicates. Asterisk denotes 
statistical significance relative to WT (P < 0.01, two-tailed t test). Experiment 
was repeated three times, producing similar results. d, Summary of known 
major Trp-derived secondary metabolites in Arabidopsis and 
oxidative biosynthetic enzymes that have been used to reconstitute the 
pathways in vitro or in planta. 




WT, the adult leaves of cyp71A12 and cyp82C2 are more susceptible to 
the virulent bacterial hemibiotroph Pst (P. syringae pv. tomato 
DC3000) and comparable to the immuno-deficient fls2 mutant, which 
cannot perceive the bacterial microbe-associated molecular pattern 
(MAMP) flg22 (refs 22, 23) (Fig. 4a). Similarly, seedlings of the 
4-OH-ICN pathway mutants are more susceptible to Pst than WT 
in the presence and absence of flg22 (Fig. 4b), indicating a role for 
4-OH-ICN in basal disease resistance against a bacterial pathogen. 
Notably, the adult leaves and seedlings of the camalexin pathway 
mutants cyp71A13 and pad3 are also more susceptible to Pst infection 
than WT (Fig. 4a, b), suggesting a previously unrecognized role for 
camalexin in the antibacterial defence response. To test for a direct role 
of the ICN pathway metabolites in the plant innate immune response, 
either as inducible antibacterial or signalling compounds, we mea- 
sured their protective effect against subsequent bacterial infection by 
infecting WT adult leaves with Pst after pre-immunizing them with 
pure compounds and flg22. Compared with a solvent control, pre- 
treatment with 4-OH-ICN (but not ICN or camalexin) conferred 
greater bacterial resistance (Fig. 4c), which supports a direct mech- 
anism of action for 4-OH-ICN in inducible plant defence. 

We also observed increased disease symptoms in adult leaves of 
the cyp82C2 mutant upon inoculation with spores from the avirulent 
fungal necrotroph Alternaria brassicicola (Extended Data Fig. 8e, f) 
and—consistent with a previous report**—the virulent necrotroph 
Botrytis cinerea (Extended Data Fig. 8a, b), but not from the obligate 
fungal biotroph Golovinomyces orontii (Extended Data Fig. 8c, d). 
Furthermore, purified ICN and 4-OH-ICN have a growth inhibitory 
effect on B. cinerea and A. brassicicola comparable to that of cama- 
lexin’®* (Extended Data Fig. 9). However, we cannot rule out the pos- 
sibility that the role of the 4-OH-ICN pathway in fungal defence is 
indirect, as adult leaves of the cyp82C2 mutant appear partly impaired 
in camalexin production after Alternaria treatment (Extended Data 
Fig. 10). 

The camalexin and 4-OH-ICN pathways rely on a pair of paralo- 
gous genes, CYP71A12 and CYP71A13, which are members of the 
CYP71 family linked to innovations in plant metabolism” (Fig. 4d). 
Strikingly, the 4-OH-ICN pathway resembles the widespread cyano- 
genic glucoside pathway that has been lost in the Brassicaceae, and 
appears to be a metabolic re-invention leading to a novel cyanogenic 
metabolite type derived from Trp’. It is possible that 4-OH-ICN acts 
in concert with other Trp-derived metabolites, each contributing to 
protection against overlapping sets of specific pathogens. Collectively, 
our data provide additional insight into the Arabidopsis defence res- 
ponse and, more generally, how plants use metabolic innovation to 
expand innate immunity. 


Online Content Methods, along with any additional Extended Data display items 
and Source Data, are available in the online version of the paper; references unique 
to these sections appear only in the online paper. 


Received 14 August 2014; accepted 14 July 2015. 
Published online 9 September 2015. 


1. Chae, L., Kim, T., Nilo-Poyanco, R. & Rhee, S. Y. Genomic signatures of specialized 
metabolism in plants. Science 344, 510-513 (2014). 

2. D'Auria, J. C. & Gershenzon, J. The secondary metabolism of Arabidopsis thaliana: 
growing like a weed. Curr. Opin. Plant Biol. 8, 308-316 (2005). 

3. Bednarek, P. & Osbourn, A. Plant-microbe interactions: chemical diversity in plant 
defense. Science 324, 746-748 (2009). 

4. Denoux, C. et al. Activation of defense response pathways by OGs and Flg22 
elicitors in Arabidopsis seedlings. Mol. Plant 1, 423-445 (2008). 

5. Bak, S. et al. Cytochromes P450. Arabidopsis Book 9, e0144 (2011). 

6. Jones, T. H., Conner, W. E., Meinwald, J., Eisner, H. E. & Eisner, T. Benzoyl cyanide 
and mandelonitrile in the cyanogenetic secretion of a centipede. J. Chem. Ecol. 2, 
421-429 (1976). 

7. Zagrobelny, M., Bak, S. & Møller, B. L. Cyanogenesis in plants and arthropods. 
Phytochemistry 69, 1457-1468 (2008). 

8. Gleadow, R. M. & Møller, B. L. Cyanogenic glycosides: synthesis, physiology, and 
phenotypic plasticity. Annu. Rev. Plant Biol. 65, 155-185 (2014). 

9. Tattersall, D. B. et al. Resistance to an herbivore through engineered cyanogenic 
glucoside synthesis. Science 293, 1826-1828 (2001). 

10. Bednarek, P. et al. A glucosinolate metabolism pathway in living plant cells 
mediates broad-spectrum antifungal defense. Science 323, 101-106 (2009). 

11. Clay, N. K., Adio, A. M., Denoux, C., Jander, G. & Ausubel, F. M. Glucosinolate 
metabolites required for an Arabidopsis innate immune response. Science 323, 
95-101 (2009). 

12. Dangl, J. L., Horvath, D. M. & Staskawicz, B. J. Pivoting the plant immune system 
from dissection to deployment. Science 341, 746-751 (2013). 

13. Ahuja, I., Kissen, R. & Bones, A. M. Phytoalexins in defense against pathogens. 
Trends Plant Sci. 17, 73-90 (2012). 

14. Vinayavekhin, N. & Saghatelian, A. in Current Protocols in Molecular Biology (eds 
Ausubel, F. M. et al.) Ch. 30 (2010). 

15. Böttcher, C. et al. The biosynthetic pathway of indole-3-carbaldehyde and indole- 
3-carboxylic acid derivatives in Arabidopsis. Plant Physiol. 165, 841-853 (2014). 

16. Mikkelsen, M. D., Hansen, C. H., Wittstock, U. & Halkier, B. A. Cytochrome P450 
CYP79B2 from Arabidopsis catalyzes the conversion of tryptophan to indole-3- 
acetaldoxime, a precursor of indole glucosinolates and indole-3-acetic acid. J. Biol. 
Chem. 275, 33712-33717 (2000). 

17. Zhao, Y. et al. Trp-dependent auxin biosynthesis in Arabidopsis: involvement of 
cytochrome P450s CYP79B2 and CYP79B3. Genes Dev. 16, 3100-3112 (2002). 

18. Millet, Y. A. et al. Innate immune responses activated in Arabidopsis roots by 
microbe-associated molecular patterns. Plant Cell 22, 973-990 (2010). 

19. Boudsocq, M. et al. Differential innate immune signalling via Ca2+ sensor protein 
kinases. Nature 464, 418-422 (2010). 

20. Peyret, H. & Lomonossoff, G. P. The pEAQ vector series: the easy and quick way to 
produce recombinant proteins in plants. Plant Mol. Biol. 83, 51-58 (2013). 

21. Thomma, B. P., Nelissen, I., Eggermont, K. & Broekaert, W. F. Deficiency in 
phytoalexin production causes enhanced susceptibility of Arabidopsis thaliana to 
the fungus Alternaria brassicicola. Plant J. 19, 163-171 (1999). 

22. Gómez-Gómez, L. & Boller, T. Flagellin perception: a paradigm for innate 
immunity. Trends Plant Sci. 7, 251-256 (2002). 

23. Zipfel, C. et al. Bacterial disease resistance in Arabidopsis through flagellin 
perception. Nature 428, 764-767 (2004). 

24. Liu, F. et al. The Arabidopsis P450 protein CYP82C2 modulates jasmonate-induced 
root growth inhibition, defense gene expression and indole glucosinolate 
biosynthesis. Cell Res. 20, 539-552 (2010). 

25. Nafisi, M. et al. Arabidopsis cytochrome P450 monooxygenase 71A13 catalyzes 
the conversion of indole-3-acetaldoxime in camalexin synthesis. Plant Cell 19, 
2039-2052 (2007). 

26. Nelson, D. & Werck-Reichhart, D. A P450-centric view of plant evolution. Plant J. 66, 
194-211 (2011). 

27. Møller, B. L. Functional diversifications of cyanogenic glucosides. Curr. Opin. Plant 
Biol. 13, 338-347 (2010). 

28. Rauhut, T. & Glawischnig, E. Evolution of camalexin and structurally related indolic 
compounds. Phytochemistry 70, 1638-1644 (2009). 


Supplementary Information is available in the online version of the paper. 


Acknowledgements We thank F. Ausubel, M. B. Mudgett, C. Khosla, A. Saghatelian, 
Y. Millet, C. Danna, S. Galanie, and members of the Sattely and Clay laboratories for 
advice on experiments and comments on the manuscript. We thank the Salk Institute 
Genomic Analysis Laboratory for providing the sequence-indexed Arabidopsis 
transfer-DNA insertion mutants. We thank G. Lomonossoff (John Innes Centre) for 
providing plasmid pEAQ. This work was supported by R00 GM089985 and DP2 
AT008321 (to E.S.S.), T32 GM008412-20 (to J.R.), and T32 GM007499-38 (to B.B.). 
The early stages of this work were supported by National Science Foundation grant 
MCB-0519898 and National Institutes of Health grant R37 GM 48707 (awarded to 
Fred Ausubel, Massachusetts General Hospital, Boston, Massachusetts, USA). 


Author Contributions J.R., B.B., N.K.C., and E.S.S. designed experiments. J.R. and B.B. 
performed experiments. J.R., B.B., N.K.C., and E.S.S. analysed data and wrote the paper. 


Author Information Reprints and permissions information is available at 
www.nature.com/reprints. The authors declare no competing financial interests. 
Readers are welcome to comment on the online version of the paper. Correspondence 
and requests for materials should be addressed to E.S.S. (sattely@stanford.edu) or 
N.K.C. (nicole.clay@yale.edu). 




[Extended Data Figure 1 panels: a, ion abundance of compounds A1-A7 and 
B1-B6 in Col-0 Mock and Col-0 +Flg22 samples. b, c, ICA methyl ester, 
m/z [M + H]+ = 176.0706; EICs for CH3OH and CD3OD extractions against 
retention time (min). d, e, triply deuterated analogue, m/z [M + H]+ = 
179.0894; EICs for CH3OH and CD3OD extractions against retention 
time (min).] 


Extended Data Figure 1 | Elicitation of compounds identified in 
metabolomics screen by flg22 peptide and origin of ICA methyl ester as 
artefact of the methanol extraction method. a, Levels of compounds in 
Flg22-elicited Arabidopsis Col-0 seedling tissue, quantified as mean [M + H]+ 
ion (m/z ± 10 ppm) abundances extracted from raw data; error bars, s.d. based 
on three biological replicates. Production of these compounds in axenic 
plant culture demonstrates that they are plant-derived. b, c, Structure and mass 
peaks of ICA methyl ester (compound A1) seen in LC-MS analysis (b), and 
EICs for the expected m/z using a standard extraction with 80:20 
CH3OH/H2O or with 80:20 CD3OD/D2O (c). d, e, Structure and mass 
spectrum peaks seen for the triply deuterated A1 analogue (d), and EICs for the 
expected m/z using extraction with 80:20 CH3OH/H2O, or with 80:20 CD3OD/ 
D2O (all EICs are to scale) (e). The presence of the deuterated analogue of ICA 
methyl ester and the complete absence of the non-deuterated compound in 
plant extracts when CD3OD is substituted for CH3OH show that the methyl 
ester is not a product of Arabidopsis metabolism, but arises because of the 
extraction method as a degradation product of ICN. 
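The deuterium-labelling argument can be checked with simple arithmetic. The panel labels for this figure give [M + H]+ values of 176.0706 for ICA methyl ester and 179.0894 for its triply deuterated analogue; the sketch below compares that difference with the shift expected when three hydrogens are replaced by deuterium. The atomic masses are standard monoisotopic values rather than numbers taken from the paper, and the function name is illustrative only.

```python
# Standard monoisotopic masses in unified atomic mass units (assumed
# reference values, not taken from the paper).
MASS_H = 1.00782503
MASS_D = 2.01410178


def deuterium_shift(n_deuterium: int) -> float:
    """Mass increase when n hydrogens are replaced by deuterium."""
    return n_deuterium * (MASS_D - MASS_H)


# [M + H]+ values shown in the figure panels.
mz_methyl_ester = 176.0706       # CH3OH extraction
mz_deuterated_ester = 179.0894   # CD3OD extraction

observed_shift = mz_deuterated_ester - mz_methyl_ester
expected_shift = deuterium_shift(3)

print(f"observed shift: {observed_shift:.4f}")
print(f"expected shift for a CD3 group: {expected_shift:.4f}")
```

The two values agree to the fourth decimal place, which is what would be expected if the methyl group of the ester comes from the extraction solvent.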


[Extended Data Figure 2 panels, each shown for plant-extracted and synthetic 
compound: a, b, NMR spectra against chemical shift. c, ultraviolet 
absorbance (mAU) against wavelength (nm), 200-525. d, product ion spectra 
(counts against m/z) with peaks at m/z 117.0571 and 144.0437-144.0439 in 
both samples. e, EICs against retention time (min).] 


Extended Data Figure 2 | Comparison of spectra for plant-extracted and 
synthetic compound establishes identity of ICN as new indolic metabolite 
produced by A. thaliana. a, Full-range (δ = 10.5 to −0.5) and, b, downfield 
region partial (δ = 8.5-7.0) 1H NMR spectra in CD3CN. Upfield contaminants 
in the full-range spectra are presumed to be residual solvent. c, Ultraviolet- 
visible absorbance spectra obtained via a diode array detector during LC 
analysis. Note that the prominent peak at 230 nm is due to acetonitrile in 
the LC mobile phase. d, Targeted MS/MS spectra for the parent ICN [M + H]+ 
ion (m/z = 171.0550) at a collision energy of 20 V. See Supplementary Table 1 
for relative peak intensities at other collision energies. e, Aligned EICs for 
the ICN [M + H]+ ion for a Col-0 + Psta tissue sample extracted with DMSO 
and synthetic compound, showing identical retention times. 


[Extended Data Figure 3 panels: paired EICs for Col-0 +Psta extracts and 
synthetic standards, plotted against retention time (min), for 4-OH-ICN and 
for compounds A1-A7 and B1-B6.] 


Extended Data Figure 3 | Comparison of plant-extracted ICN derivatives, 
4-OH-ICN derivatives, and synthetic standards shows identical column 
elution times for all compounds. Col-0 + Psta combined EICs were extracted 
for the relevant compound [M + H]+ m/z values for a DMSO-extracted 
medium sample (4-OH-ICN trace), or a MeOH-extracted seedling tissue 
sample (all other traces), while synthetic EICs were extracted for a mixed 
standard in DMSO. Note that chromatograms are not to scale, and the 
synthetic standard is not equimolar with respect to all compounds because of 
partial degradation. 


[Extended Data Figure 4 panels: a-d, 1H NMR spectra against chemical 
shift (ppm) for ICA (a), 4-OH-ICA (b), the CYP82C2 enzymatic reaction (c) 
and the CYP82C2 enzymatic reaction + 4-OH-ICA (d). e, EICs against retention 
time (min) for CYP82C2 + ICN, CYP82C2 + ICA and empty vector + ICN; 
4-OH-ICA, m/z = 178.0499; 4-OH-ICN, m/z = 187.0502. f, Scheme: CYP82C2 
converts ICN to 4-OH-ICN (NADPH to NADP+); ICN and 4-OH-ICN lose HCN to 
give ICA and 4-OH-ICA.] 


Extended Data Figure 4 | CYP82C2 is an ICN 4-hydroxylase. a, b, 1H NMR 
spectra in CD3OD of synthetic ICA (a) and 4-OH-ICA (b). c, Spectrum for 
large-scale enzymatic reaction extract of ICN incubated with CYP82C2. In 
addition to ICA, resulting from hydrolysis of ICN, peaks for a singly 
hydroxylated analogue of ICA are seen; these are qualitatively consistent with, 
but shifted slightly upfield (~30-60 Hz) from the 4-OH-ICA spectrum, 
possibly because of impurities or a pH effect in the enzymatic reaction sample. 
d, To confirm the identity conclusively, 80 µg of 4-OH-ICA dissolved in 
CD3OD was added to the enzymatic reaction NMR sample before acquiring 
another spectrum: no new peaks are seen, while the prior hydroxylated ICA 
peaks grow in intensity, establishing the product of the enzymatic reaction as 
4-OH-ICA. e, EICs for enzymatic reactions of CYP82C2 on ICN or ICA, or 
empty vector control incubation with ICN. Only trace amounts of the expected 
4-OH-ICN product but significant amounts of 4-OH-ICA are seen for the 
CYP82C2/ICN reaction. No hydroxylated products are seen for the CYP82C2/ 
ICA or empty vector/ICN reactions, indicating that CYP82C2 catalyses 
only the hydroxylation of ICN to 4-OH-ICN, but 4-OH-ICA is seen as 
the predominant end product due to rapid hydrolysis of 4-OH-ICN (f). 
Chromatograms in this figure were obtained using the 20 min LC-MS gradient 
(see Supplementary Information, Methods section 1.9 LC-MS analysis). 


[Extended Data Figure 5 panels: bar charts of logarithmic metabolite ratios 
relative to WT +Psta (axis -3 to 3) for WT Mock, cyp79B2 cyp79B3 +Psta, 
cyp71A12 +Psta and fox1 +Psta; and WT +Psta levels as amount (nmol/g dry 
weight), 0-3000, with compounds including B1 and Cam.] 


Extended Data Figure 5 | Levels of numerous Arabidopsis indolic 
metabolites are altered in ICN pathway gene insertion lines compared with 
WT plants. a-e, Relative compound levels for mock treatment condition 
and indicated pathway insertion line mutants, and, f, absolute levels in Psta- 
treated WT (Col-0) seedlings. For a-e, data bars represent a logarithmically 
scaled ratio of mean metabolite levels in the indicated line or treatment 
condition, quantified as [M + H]+ ion abundances by LC-MS analysis with 
XCMS processing, to levels in Psta-treated WT Arabidopsis seedlings. In 
f, absolute levels for all compounds except RA were quantified by measuring 
[M + H]+ ion abundances and comparing to standard curves. Error bars, s.d., 
based on six biological replicates. Cam, camalexin; RA, raphanusamic acid; 
other abbreviations as detailed previously. 


[Extended Data Figure 6 panels: a, EICs (ion abundance against retention 
time, 8-16 min) in fox1 +Psta for ICG1 (m/z = 357.1094) and ICG2 
(m/z = 443.1070). b, Structures of ICG1, m/z [M + Na]+ = 357.1057, and 
ICG2, m/z [M + Na]+ = 443.1061. c, d, Product ion spectra (counts against 
m/z) with the assignment tables below. e, f, Bar charts of ICG1 and ICG2 
levels in WT Mock, WT +Psta, cyp79B2 cyp79B3 +Psta, cyp71A12 +Psta, 
fox1 +Psta and cyp82C2 +Psta (e), and in N. benthamiana expressing empty 
vector, CYP79B2, CYP79B2 + CYP71A12, CYP79B2 + CYP71A12 + FOX1, or 
CYP79B2 + CYP71A12 + FOX1 + CYP82C2 (f).] 

ICG1 MS/MS spectrum assignments (precursor m/z 691.22101, 40 V) 

m/z (observed)   Predicted formula   Ion assignment                          m/z (theoretical)   Δm/z (ppm) 
691.2210         C32H36N4NaO12+      [2M + Na]+                              691.2220            -1.4 
357.1066         C16H18N2NaO6+       [M + Na]+                               357.1057            2.5 
330.0921         C15H17NNaO6+        [M + Na - HCN]+                         330.0948            8.2 
203.0500         C6H12NaO6+          [Glucose + Na]+                         203.0526            -12.8 
185.0429         C6H10NaO5+          [Glucose + Na - H2O]+                   185.0420            4.9 
168.0428         C9H7NNaO+           [M + Na - glucose - HCN + H2O]+         168.0420            4.8 
146.0560         C9H8NO+             [M + H - glucose - HCN + H2O]+          146.0600            -27.4 
50.0033          CHNNa+              [HCN + Na]+                             50.0001             64.0 

ICG2 MS/MS spectrum assignments (precursor m/z 443.10699, 20 V) 

m/z (observed)   Predicted formula   Ion assignment                              m/z (theoretical)   Δm/z (ppm) 
443.1117         C19H20N2NaO9+       [M + Na]+                                   443.1061            12.6 
289.0535         C9H14NaO9+          [O-malonyl glucose + Na]+                   289.0530            1.7 
271.0445         C9H12NaO8+          [O-malonyl glucose + Na - H2O]+             271.0424            7.7 
245.0663         C8H14NaO7+          [O-malonyl glucose + Na - CO2]+             245.0632            12.6 
227.0531         C8H12NaO6+          [O-malonyl glucose + Na - H2O - CO2]+       227.0526            2.2 
168.0428         C9H7NNaO+           [M + Na - O-mal. gluc. - HCN + H2O]+        168.0420            4.8 


Extended Data Figure 6 | Putative ICGs observed in Arabidopsis and in 
N. benthamiana expressing ICN pathway enzymes. a, EICs for putative 
ICGs in WT Arabidopsis and fox1 mutant elicited with Psta. The m/z values 
shown are median values calculated by XCMS. b, Hypothesized structures and 
theoretical m/z values for the two ICGs identified. c, MS/MS spectrum for 
ICG1; m/z values and relative abundances are shown above each peak. The ion 
analysed here (m/z = 691.2210) represents a [2M + Na]+ dimer that is 
significantly more abundant than the [M + Na]+ ion. Direct analysis of the 
[M + Na]+ ion (m/z = 357.1057) yielded low abundance spectra that could 
not be easily analysed. At lower collision energies, the [2M + Na]+ ion 
fragments to [M + Na]+ but yields a rich spectrum at 40 V, which is shown. 
Predicted peak assignments for the ICG1 MS/MS spectrum are shown in the 
accompanying table. For peaks in bold, exact counterparts could be identified in 
the dhurrin [M + Na]+ 20 V MS/MS spectrum in the METLIN metabolite 
database. d, MS/MS spectrum obtained for the ICG2 [M + Na]+ ion and 
predicted peak assignments. While the [2M + Na]+ peak (m/z = 864.2225) is 
also seen for this compound (not shown), [M + Na]+ is more abundant in 
this case, and was analysed directly. e, f, Levels of ICG1 and ICG2 in ICN 
pathway mutants (e) and in WT plants elicited with Psta and N. benthamiana 
expressing ICN pathway enzymes (f). For e and f, levels are quantified as mean 
[M + Na]+ ion (m/z ± 10 ppm) abundances extracted from raw data; error 
bars, s.d., based on six biological replicates. 
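The Δm/z (ppm) column of the MS/MS assignment tables for this figure, and the ±10 ppm extraction window mentioned in the caption, both rest on the same relative mass error. The following is a minimal sketch of that calculation using three observed/theoretical pairs from the ICG1 assignments; the function names are illustrative and are not from the paper or from any particular mass spectrometry package.

```python
def ppm_error(observed_mz: float, theoretical_mz: float) -> float:
    """Relative mass error in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def within_window(observed_mz: float, target_mz: float, tol_ppm: float = 10.0) -> bool:
    """True if the observed m/z lies inside target_mz +/- tol_ppm."""
    return abs(ppm_error(observed_mz, target_mz)) <= tol_ppm


# (ion assignment, observed m/z, theoretical m/z) from the ICG1 assignments.
icg1_peaks = [
    ("[2M + Na]+", 691.2210, 691.2220),
    ("[M + Na]+", 357.1066, 357.1057),
    ("[Glucose + Na]+", 203.0500, 203.0526),
]

for ion, observed, theoretical in icg1_peaks:
    error = ppm_error(observed, theoretical)
    flag = "inside" if within_window(observed, theoretical) else "outside"
    print(f"{ion}: {error:+.1f} ppm ({flag} a 10 ppm window)")
```

For these three rows the computed errors round to -1.4, +2.5 and -12.8 ppm, matching the tabulated values.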


[Extended Data Figure 7 panels: one bar chart per set of transiently 
expressed genes: empty vector control; CYP79B2; CYP79B2 + CYP71A12; 
CYP79B2 + CYP71A12 + FOX1; CYP79B2 + CYP71A12 + FOX1 + CYP82C2. Left 
axis, ion abundance (A1-A7, B1-B6, 4-OH-ICA); right axis, ion abundance 
(IAOx, IAL, ICA).] 


Extended Data Figure 7 | ICN pathway metabolites are produced in 
N. benthamiana transiently expressing pathway genes. Levels of ICN and 
4-OH-ICN derivatives (left axis) and other relevant indolic compounds 
(right axis), quantified as mean [M + H]+ ion (m/z ± 10 ppm) abundances 
extracted from raw data; error bars, s.d., based on six biological replicates. The 
set of transiently expressed genes is indicated for each panel. Background levels 
of ICA and IAL detected when only the early pathway genes CYP71A12 and/or 
CYP79B2 are expressed indicate potential involvement of endogenous N. 
benthamiana enzymes. 


[Extended Data Figure 8 panels: a, c, e, stained leaf images. b, Lesion 
diameter (mm) for WT, cyp71A12, cyp82C2, cyp71A13 and pad3-1. 
d, Conidiophores/leaf for pad4-1, cyp82C2 and WT. f, Lesion diameter (mm) 
for WT and cyp79B2 cyp79B3 (top graph) and for WT, cyp71A12, cyp82C2, 
cyp71A13 and pad3-1 (bottom graph).] 


Extended Data Figure 8 | ICN pathway metabolites contribute to disease 
resistance towards B. cinerea but not towards G. orontii. a, Top: typical 
lactophenol trypan blue staining of leaves drop-inoculated with spores from the 
virulent fungal necrotroph B. cinerea to visualize the extent of host cell death 
(darkly stained areas within and beyond the fungal spore droplet region). 
Middle: microscopic analysis of stained leaves to visualize the extent of fungal 
colonization (stained filamentous fungal hyphae within and beyond the 
fungal spore droplet region). Images were taken at the same magnification 
(X25) and are representative of five biological replicates. Bottom: close-up 
images of the fungal hyphae beyond the fungal spore droplet region for cyp82C2 
and cyp71A13 mutants. Images were taken at the same magnification (< 100). 
b, Measurement of the disease lesion diameters in infected leaves. Data 
represent the median + s.e.m. for five biological replicates. Asterisks denote 
statistical significance relative to WT (P < 0.05, two-tailed t test). c, Typical 
lactophenol trypan blue staining of fungal conidiophores (spore-bearing 
structures) formed in leaves infected with the adapted powdery mildew 
G. orontii. The pad4-1 mutant is more susceptible to fungal growth by G. orontii 
and thus produces significantly more conidiophores. Images were taken at the 
same magnification (×100) and are representative of three biological replicates. 
d, Measurement of the number of conidiophores in infected leaves. Data 
represent the mean ± s.d. for three biological replicates. e, Top: typical disease 
symptoms 3 days after drop inoculation of leaves with spores from the avirulent 
fungal necrotroph A. brassicicola. Bottom: microscopic analysis of infected 
leaves after lactophenol trypan blue staining confirming that disease symptoms 
are consistent with extent of fungal colonization (lightly stained fungal hyphae 
extending from the fungal spore droplet region) and host cell death (darkly 
stained areas along and beyond the border of the spore droplet region). Images 
were taken at the same magnification (×25) and are representative of ten 
biological replicates. f, Measurement of the disease lesion diameters in infected 
leaves. Data represent the median ± s.e.m. of eight (top graph) or ten biological 
replicates (bottom graph). Different letters denote statistically significant 
differences (P < 0.05, two-tailed t-test). 




[Extended Data Figure 9 plots. a, B. cinerea SF1; b, A. brassicicola FSU218. Compounds tested: camalexin, ICN, ICA and KCN, 4-OH-ICN, 4-OH-ICA and KCN. x axis, concentration (µM), 0 to 250; y axis, OD600 (72 h).]



Extended Data Figure 9 | ICN and 4-OH-ICN but not their degradation 
products inhibit fungal growth in vitro. a, b, Fungal growth inhibition assays 
on B. cinerea SF1 (a) or A. brassicicola FSU218 (b) with the tested compound 
(or compound combination) indicated. For compound combinations, the 
concentration indicated is for each compound; the given combinations 
approximate the hydrolysis products of ICN or 4-OH-ICN. Growth of fungi in 
potato dextrose broth on a microplate was quantified by measuring absorbance 
at 600 nm (OD600 nm) 72 h after spore inoculation and subtracting the 
absorbance at 0 h; see Methods for further details. Error bars, s.d. based on three 
biological replicates. Note that the half-maximum inhibitory concentrations 
(IC50) for both camalexin and ICN are approximately 25 µM against B. cinerea 
and 50 µM against A. brassicicola. For 4-OH-ICN, the inhibitory effect is not as 
pronounced, possibly because of rapid degradation of 4-OH-ICN in potato 
dextrose broth (see Supplementary Table 2). 
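
The quantification described in this legend (absorbance at 72 h minus absorbance at 0 h, followed by reading off a half-maximum inhibitory concentration) can be sketched as below. This is only an illustrative sketch: the readings are hypothetical and not data from the paper, the function names are invented for the example, and linear interpolation between neighbouring concentrations is an assumption, since the fitting procedure is described in Methods rather than here.

```python
def growth(od_72h, od_0h):
    """Background-corrected growth: absorbance at 72 h minus absorbance at 0 h."""
    return od_72h - od_0h


def ic50_by_interpolation(concs, growths):
    """Estimate the concentration that gives half of the untreated growth.

    concs must be sorted ascending and start at 0 (untreated control).
    Uses linear interpolation between the two bracketing points (an
    assumption of this sketch) and returns None if growth never falls
    to half of the control value.
    """
    half = growths[0] / 2.0
    points = list(zip(concs, growths))
    for (c_lo, g_lo), (c_hi, g_hi) in zip(points, points[1:]):
        if g_lo >= half >= g_hi and g_lo != g_hi:
            return c_lo + (g_lo - half) * (c_hi - c_lo) / (g_lo - g_hi)
    return None


# Hypothetical readings (NOT data from the paper); concentrations in µM.
concs = [0, 50, 100, 150, 200, 250]
od_0h = [0.05, 0.05, 0.05, 0.05, 0.05, 0.05]
od_72h = [0.45, 0.15, 0.09, 0.07, 0.06, 0.05]

growths = [growth(late, early) for late, early in zip(od_72h, od_0h)]
ic50 = ic50_by_interpolation(concs, growths)
print(f"interpolated IC50: {ic50:.1f} µM")
```

With only six concentrations per compound, an interpolated value of this kind is coarse, which is consistent with the legend describing the IC50 values as approximate.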




[Extended Data Figure 10 plots. a, indole-3-carboxylic acid; b, indole-3-carboxylic acid methyl ester; c, camalexin; d, 4-OH-indole-3-carboxylic acid; e, tryptophan; f, indole glucosinolates (indole, OMe-indole 1, OMe-indole 2). Each panel compares mock treatment, A. brassicicola infection and B. cinerea infection.]

Extended Data Figure 10 | Levels of indolic compounds in leaves of mature 
plants after mock treatment or fungal infection. Tissue extracts were 
analysed by LC-MS 7 days post-infection for A. brassicicola FSU218 and 5 days 
post-infection for B. cinerea SF1. a-e, Levels of indicated compound, quantified 
as EIC integral for the [M + H]⁺ ion (m/z ± 10 ppm) and converted to absolute 
amounts by comparison with a standard curve. f, Ion count integrals for 
indole glucosinolates ([M − H]⁻ ion, m/z ± 10 ppm). Error bars in all panels, s.d. 
based on six biological replicates. 
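
The two numerical steps named in this legend, an extraction window of ± 10 ppm around the ion m/z and conversion of an EIC integral to an absolute amount through a standard curve, can be sketched as follows. The m/z value, the calibration points and the function names are hypothetical and chosen for illustration only, and a straight-line standard curve is an assumption of this sketch.

```python
def mz_window(mz, ppm=10.0):
    """Return (low, high) m/z bounds for a +/- ppm extraction window."""
    delta = mz * ppm * 1e-6
    return mz - delta, mz + delta


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def amount_from_integral(integral, slope, intercept):
    """Invert the standard curve to convert an EIC integral to an amount."""
    return (integral - intercept) / slope


# Hypothetical ion at m/z 200.0 (NOT a compound from the paper).
low, high = mz_window(200.0)

# Hypothetical standard curve: known amounts versus measured EIC integrals.
std_amounts = [0.0, 10.0, 20.0, 40.0]
std_integrals = [0.0, 1000.0, 2000.0, 4000.0]
slope, intercept = fit_line(std_amounts, std_integrals)

sample_amount = amount_from_integral(2500.0, slope, intercept)
print(low, high, sample_amount)
```

Because the window scales with m/z, the same ± 10 ppm tolerance is narrower in absolute terms for lighter ions than for heavier ones.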




doi:10.1038/nature15248 


Erosion of the chronic myeloid leukaemia 
stem cell pool by PPARγ agonists 


Stephane Prost¹, Francis Relouzat¹, Marc Spentchian², Yasmine Ouzegdouh¹, Joseph Saliba¹, Gérald Massonnet³, 
Jean-Paul Beressi⁴, Els Verhoeyen⁵,⁶, Victoria Raggueneau⁷, Benjamin Maneglier⁸, Sylvie Castaigne⁹, Christine Chomienne³, 
Stany Chrétien¹,¹⁰*, Philippe Rousselot³,⁹* & Philippe Leboulch¹,¹¹,¹²* 


Whether cancer is maintained by a small number of stem cells or is 
composed of proliferating cells with approximate phenotypic equivalency 
is a central question in cancer biology¹. In the stem cell 
hypothesis, relapse after treatment may occur by failure to erad- 
icate cancer stem cells. Chronic myeloid leukaemia (CML) is quint- 
essential to this hypothesis. CML is a myeloproliferative disorder 
that results from dysregulated tyrosine kinase activity of the fusion 
oncoprotein BCR-ABL². During the chronic phase, this sole genetic 
abnormality (chromosomal translocation Ph⁺: t(9;22)(q34;q11)) at 
the stem cell level causes increased proliferation of myeloid cells 
without loss of their capacity to differentiate. Without treatment, 
most patients progress to the blast phase when additional oncogenic 
mutations result in a fatal acute leukaemia made of proliferating 
immature cells. Imatinib mesylate and other tyrosine kinase inhibi- 
tors (TKIs) that target the kinase activity of BCR-ABL have 
improved patient survival markedly. However, fewer than 10% of 
patients reach the stage of complete molecular response (CMR), 
defined as the point when BCR-ABL transcripts become undetectable 
in blood cells³. Failure to reach CMR results from the inability 
of TKIs to eradicate quiescent CML leukaemia stem cells (LSCs)²,⁴. 
Here we show that the residual CML LSC pool can be gradually 
purged by the glitazones, antidiabetic drugs that are agonists of 
peroxisome proliferator-activated receptor-γ (PPARγ). We found 
that activation of PPARγ by the glitazones decreases expression of 
STAT5 and its downstream targets HIF2α⁵ and CITED2⁶, which are 
key guardians of the quiescence and stemness of CML LSCs. When 
pioglitazone was given temporarily to three CML patients in chronic 
residual disease in spite of continuous treatment with imatinib, all of 
them achieved sustained CMR, up to 4.7 years after withdrawal of 
pioglitazone. This suggests that clinically relevant cancer eradication 
may become a generally attainable goal by combination therapy that 
erodes the cancer stem cell pool. 

Cell division tracking with carboxyfluorescein diacetate-succinimidyl 
ester (CFSE) indicates that non-cycling CML cells are poorly sensitive 
to TKIs⁷,⁸ and that the quiescent TKI-resistant subpopulation is 
enriched in CD34⁺38⁻ cells⁹. CML LSCs are hence similar to normal 
quiescent haematopoietic stem cells (HSCs), although they are cyto- 
kine-independent’. Because failure to reach CMR occurs even when 
BCR-ABL remains sensitive to TKIs’, we searched for possible ‘non- 
oncogene addiction (NOA)’ of CML LSCs as a novel therapeutic target. 
NOA indicates that a given malignant cell is abnormally sensitive to 
quantitative variations in an otherwise normal molecular pathway¹⁰. 

We previously reported that the Nef proteins of the immunodeficiency 
viruses impair haematopoiesis by activating peroxisome 
proliferator-activated receptor gamma (PPARγ)¹¹. This effect was 
reproduced by the thiazolidinediones, a class of synthetic PPARγ ligands 
(Extended Data Fig. 1a), although it is compensated in individuals with 
otherwise normal haematopoiesis¹². We then became intrigued by our 
observation that the CML cell line K562 is particularly sensitive to Nef 
and thiazolidinediones¹¹. The involvement of PPARγ was also more 
recently reported in haematopoietic stress response¹³. 

We turned to a cohort of 29 chronic phase (CP) CML patients at 
diagnosis whose CD34⁺ cells were >95% Ph⁺. Combining imatinib 
and pioglitazone showed evidence of synergy, with a decrease in the 
number of colony-forming cells (CFC) sixfold more pronounced 
(P < 0.0001) than with imatinib alone (Extended Data Fig. 2a). A 
similar trend was observed when normal CD34⁺ cells were transduced 
with a lentiviral vector expressing p210 BCR-ABL (Extended Data 
Fig. 2b). Whereas imatinib alone was unable to reduce significantly 
the frequency of CP-CML long term culture-initiating cells (LTC-ICs) 
(P = 0.067), we found that pioglitazone was able to do so, either as a 
single agent by 2.4-fold (P = 0.008) or with an improved effect by 3.5- 
fold in the presence of imatinib (P < 0.001) (Fig. 1a, b). Similar results 
were obtained with the second generation TKI dasatinib or with 
another thiazolidinedione, rosiglitazone (Extended Data Fig. 2c, d). 
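
As a quick arithmetic check, the fold reductions quoted in this paragraph follow from the LTC-IC frequencies listed in Fig. 1b (1/303 untreated, 1/495 imatinib, 1/715 pioglitazone, 1/1,052 for the combination). The sketch below assumes that each fold value is the ratio of the untreated frequency to the treated frequency; the function and variable names are illustrative only.

```python
from fractions import Fraction

# LTC-IC frequencies as listed in Fig. 1b (one LTC-IC per N seeded cells).
ltc_ic_frequency = {
    "untreated": Fraction(1, 303),
    "imatinib": Fraction(1, 495),
    "pioglitazone": Fraction(1, 715),
    "imatinib + pioglitazone": Fraction(1, 1052),
}


def fold_reduction(control, treated):
    """Ratio of control frequency to treated frequency (assumed definition)."""
    return float(control / treated)


for condition, frequency in ltc_ic_frequency.items():
    if condition == "untreated":
        continue
    fold = fold_reduction(ltc_ic_frequency["untreated"], frequency)
    print(f"{condition}: {fold:.1f}-fold reduction versus untreated")
```

Under that assumption the ratios round to 2.4-fold for pioglitazone alone and 3.5-fold for the combination, matching the values in the text.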

CFSE assays were then performed with CP-CML CD34⁺ cells in the 
absence of cytokines (Fig. 1c-e and Extended Data Table 1). Untreated 
control CP-CML CD34⁺ cells proliferated and differentiated actively. 
Imatinib exposure resulted in the elimination of actively dividing cells 
but also in the accumulation of viable CFSE-bright CD34⁺ cells that 
never divided (‘P’) or had divided only once (Fig. 1d). Pioglitazone 
alone was less effective than imatinib at depleting the bulk of dividing 
CML cells but triggered exit from quiescence (Fig. 1c-e and 
Extended Data Table 1). Combining pioglitazone with either imatinib 
or dasatinib acted in synergy to deplete both proliferating and 
non-proliferating cells (Fig. 1c-e, Extended Data Table 1 and Extended 
Data Fig. 2e). Imatinib alone was effective at decreasing the number 
of Ph⁺ CD34⁺ CD38⁺ progenitors but failed to reduce the more 
immature CD34⁺ CD38⁻ population, in contrast to pioglitazone alone 
(Extended Data Fig. 3b). 

We then investigated the possible molecular pathways that mediate 
pioglitazone activity against CML LSCs. We previously reported that 
PPARγ is a negative transcriptional regulator of STAT5 (A and B)¹¹. 
STAT5 is known to be critical for maintenance and fitness of both 
normal HSCs¹⁴ and CML cells, where STAT5 is activated upon direct 
phosphorylation by the BCR-ABL kinase¹⁵. STAT5 expression levels 
were abnormally high in both total CP-CML CD34⁺ cells and quiescent 
LSCs (Fig. 2a). In CFSE-bright cells (that is, P and 1 division of 


¹CEA, Institute of Emerging Diseases and Innovative Therapies (iMETI), F-92265 Fontenay-aux-Roses, France. ²Département de biologie médicale, Hôpital Mignot, F-78150 Le Chesnay, France. ³Unité de 
Biologie Cellulaire, UMR-S-940 Institut Universitaire d’Hématologie, Hôpital Saint Louis, F-75010 Paris, France. ⁴Service d’Endocrinologie et de Diabétologie, Hôpital Mignot, F-78150 Le Chesnay, France. 
⁵CIRI, International Center for Infectiology Research, EVIR team, Inserm, U1111, CNRS, UMR5308, Université de Lyon-1, ENS de Lyon, 69007 Lyon, France. ⁶Inserm, U895, Centre de Médecine 
Moléculaire (C3M), équipe 3, 06204 Nice, France. ⁷Laboratoire d’hématologie, Centre Hospitalier de Versailles, F-78150 Le Chesnay, France. ⁸Unité de Pharmacologie, Service de Biologie Médicale, Centre 
Hospitalier de Versailles, F-78150 Le Chesnay, France. ⁹Service d’Hématologie et d’Oncologie, Hôpital Mignot, Université Versailles Saint-Quentin-en-Yvelines, F-78150 Le Chesnay, France. ¹⁰Inserm, 
Institute of Emerging Diseases and Innovative Therapies (iMETI), F-92265 Fontenay-aux-Roses, France. ¹¹Genetics Division, Brigham & Women’s Hospital and Harvard Medical School, Boston, 
Massachusetts 02115, USA. ¹²Hematology Division, Ramathibodi Hospital and Mahidol University, 10400 Bangkok, Thailand. 


*These authors contributed equally to this work. 


380 | NATURE | VOL 525 | 17 SEPTEMBER 2015 




[Figure 1 panels. a, LTC-IC (LDA): number of Ph⁺CD34⁺ cells seeded per well for untreated, imatinib, Pio and imatinib + Pio cultures. 
b, LTC-IC frequencies (CI, 95% confidence interval): 
Untreated: 1/303 (1/244 to 1/376) 
Imatinib: 1/495 (1/386 to 1/635) 
Pio: 1/715 (1/537 to 1/950) 
Imatinib + Pio: 1/1,052 (1/753 to 1/2,026) 
c, CFSE profiles (number of mitoses) for untreated, imatinib, Pio and imatinib + Pio cultures. d, CD34⁺ cell distribution at 14 days (%) by cell division number for each condition. e, CD34⁺ and CD34⁻ cell counts at D3, D7 and D14 for each condition.] 


Figure 1 | Pioglitazone purges quiescent CML stem cells. a, Limiting dilution 
analysis (LDA) of CML LSCs by LTC-IC assay. Pio, pioglitazone. b, LTC-IC 
frequencies calculated from a (n = 4). c, CFSE analysis (patient 4) after liquid 
culture in serum-free medium without cytokines. P (red), colcemid-arrested 
‘parent cells’. d, Distribution (%) of CD34⁺ cells in each mitosis peak shown in 
c. e, Identical culture conditions as in c, but for patient 2. Left scale (black) 
and histograms show cell counts. Right scale (red) and red dots and lines show 
the number of undivided CD34⁺ cells (P in CFSE assay). Also see Extended 
Data Table 1 (n = 6). See statistics in Methods. 
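
For readers unfamiliar with how a frequency such as "1 in 303" is obtained from a dilution series, the sketch below estimates a responder frequency under the common single-hit Poisson assumption, in which the probability that a well is negative is exp(-frequency × cells seeded). This is an assumed, simplified estimator (least squares through the origin) and not necessarily the one used by the authors, whose statistics are described in Methods; the plate readout is hypothetical.

```python
import math


def lda_frequency(doses, negative_fractions):
    """Least-squares estimate of responder frequency under a single-hit
    Poisson model, where P(negative well) = exp(-frequency * dose).

    Fits -ln(negative fraction) = frequency * dose through the origin.
    A negative fraction of 0 has no finite logarithm and is skipped.
    """
    numerator = 0.0
    denominator = 0.0
    for dose, fraction in zip(doses, negative_fractions):
        if 0.0 < fraction <= 1.0:
            numerator += dose * -math.log(fraction)
            denominator += dose * dose
    if denominator == 0.0:
        raise ValueError("no informative dilution points")
    return numerator / denominator


# Hypothetical plate readout (NOT data from the paper): cells seeded per
# well and the fraction of wells scored negative at each dose, generated
# here from a true frequency of 1 in 300.
doses = [50, 100, 200, 300]
negative_fractions = [math.exp(-dose / 300.0) for dose in doses]

frequency = lda_frequency(doses, negative_fractions)
print(f"estimated frequency: 1 in {1.0 / frequency:.0f} cells")
```

Dedicated limiting dilution software typically uses maximum likelihood and reports confidence intervals such as those listed for panel b; the sketch only illustrates the underlying model.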


CP-CML CD34⁺ cells) purified at 14 days of culture without 
cytokines, STAT5B messenger RNA levels decreased by 8.5-fold 
(P < 0.0001), 1.5-fold (P = 0.08) and 10.5-fold (P < 0.0001) in the 
presence of pioglitazone, imatinib and the drug combination, respectively 
(Fig. 2a). Similar values were obtained for STAT5A (not shown). 
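
Fold decreases of this kind are commonly derived from RT-qPCR cycle thresholds with the 2^(-ΔΔCt) method. The sketch below assumes that method, a single reference gene and perfect amplification efficiency; none of these details is stated in this section, and the Ct values are hypothetical rather than taken from the paper.

```python
def relative_expression(ct_target_treated, ct_ref_treated,
                        ct_target_control, ct_ref_control):
    """Relative expression by the 2^(-ddCt) method (assumes 100% efficiency)."""
    d_ct_treated = ct_target_treated - ct_ref_treated
    d_ct_control = ct_target_control - ct_ref_control
    return 2.0 ** -(d_ct_treated - d_ct_control)


def fold_decrease(relative):
    """Express a relative expression below 1 as a fold decrease."""
    return 1.0 / relative


# Hypothetical cycle thresholds (NOT data from the paper).
relative = relative_expression(
    ct_target_treated=27.0, ct_ref_treated=18.0,   # treated sample
    ct_target_control=24.0, ct_ref_control=18.0,   # untreated control
)
print(f"relative expression: {relative}, fold decrease: {fold_decrease(relative)}")
```

A target that needs three more cycles than in the control, after normalization to the reference gene, corresponds to an eightfold decrease under these assumptions.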

We then compared mRNA levels of four known STAT5 target 
genes. Addition of pioglitazone to imatinib significantly reduced 
expression of BCL2L1 (also known as BCL-XL)¹⁶ (3.3-fold), BCL2¹⁶ 
(4.8-fold), PIM1¹⁷ (1.6-fold) and CISH (also known as CIS)¹⁸ (1.6-fold), 
thus suggesting that imatinib alone is not able to inhibit STAT5 
transcriptional activity to completion (Fig. 2b). Supplementary studies 
with the bromodomain inhibitor JQ1¹⁹ confirmed the pivotal role 
played by STAT5 in CML LSCs (Supplementary Data and Extended 
Data Fig. 2f). 

The effect of pioglitazone was negated by a short interfering RNA 
against PPARγ (also known as PPARG) mRNA (Fig. 2c and control 
Extended Data Fig. 1b, c). Decreased clonogenicity of CP-CML 
CD34⁺ cells in the presence of pioglitazone was abolished when 
STAT5B was overexpressed after lentiviral transfer (Fig. 2d). We 
observed that imatinib acts rapidly (in minutes) by preventing 
STAT5 phosphorylation, whereas pioglitazone acts slowly (in days) 
by decreasing STAT5 protein levels (Extended Data Fig. 3a, c and d), 
owing to the known long half-life of STAT5 protein in spite of the 
rapid decrease in its mRNA levels’*. Pioglitazone activity in CFSE 
assays with CP-CML cells was abrogated when STAT5B was overexpressed 
by lentiviral transfer (Fig. 2e, Extended Data Fig. 4a-e and 
Extended Data Fig. 1c). Importantly, pioglitazone was found to be 
more inhibitory/cytotoxic for CML LTC-ICs than for normal LTC-ICs 
(Extended Data Fig. 5). 




[Figure 2 panels. a, STAT5B mRNA in purified CFSE-bright cells with or without Pio and imatinib. b, mRNA expression of BCL-XL, BCL2, PIM1 and CIS with imatinib or imatinib + Pio. c, effect of siPPARγ with or without Pio. d, CFC assays with or without Pio after transduction with LvSTAT5 or LvGFP (P = 0.0047). e, CD34⁺ and CD34⁻ cell counts with or without Pio after transduction with LvSTAT5 or LvGFP.] 


Figure 2 | Pioglitazone targets the PPARγ-STAT5 pathway in CML LSCs. 
a, Normalized STAT5B quantitative PCR with reverse transcription 
(RT-qPCR) on CFSE-bright cells (that is, P and 1 division) at 14 days of culture. 
b, Per cent mRNA expression of STAT5 target genes in CFSE-bright cells 
after drug exposure. c, CP-CML CD34⁺ cells cultured with an anti-PPARγ 
siRNA before RT-qPCR. d, Colony-forming cell (CFC) assays with CP-CML 
CD34⁺ cells after transduction with lentivectors (Lv) expressing enhanced 
green fluorescent protein (eGFP; negative control) or STAT5B. e, Absolute 
cell count together with CFSE analysis (patient 2 in triplicate). Red dots, 
undivided CD34⁺ cells (P in CFSE assay). Data show means ± s.d., n = 5. See 
statistics in Methods. 


We then examined, in 7-day cultures without cytokines of CP-CML 
CD34⁺ cells from 11 patients, mRNA expression levels for 9 putative 
downstream transcriptional targets of STAT5 and/or PPARγ (Fig. 3a). 
These included OCT1 (also known as POU2F1)²⁰, PML, SIRT1, 
ALOX5, STAT3, MDR1 (also known as ABCB1), GLUT1 (also known 
as SLC2A1), β-catenin (also known as CTNNB1) and HIF2α (also 
known as EPAS1)¹⁷. CD36, known to be upregulated by PPARγ agonists, 
was used as a positive control²¹. Only OCT1 and HIF2α expression 
levels were significantly altered after culture in the presence of 
pioglitazone + imatinib versus imatinib alone (Fig. 3a). Although 
upregulation of OCT1 expression may increase the cellular uptake of 
imatinib²⁰, we found that erosion of the CP-CML LSC pool was not 
improved in the presence of imatinib alone when OCT1 was overexpressed 
by lentiviral transfer (Extended Data Fig. 6). 

In contrast to OCT1, HIF2α was downregulated by pioglitazone 
(Fig. 3a). Importantly, HIF2α and to a lesser degree HIF1α were found 
upregulated in imatinib-resistant CFSE-bright cells (P and 1 cell division), 
while pioglitazone counteracted this phenomenon (Fig. 3b). 
CITED2, a key gene of HSC “stemness”⁶, known to be upregulated 
by HIF2α⁵, followed the same trend (Fig. 3b). The viability of undivided 
and imatinib-resistant CP-CML CD34⁺ cells required HIF2α expression 
(Fig. 3c, Extended Data Fig. 7a-e and Extended Data Fig. 1c), and 
forced expression of HIF2α in normal human cord blood CD34⁺ cells 
increased the compartment of quiescent cells (Fig. 3d, Extended Data 
Fig. 7f-i). Furthermore, forced expression of BCR-ABL in normal 
human CD34⁺ cells induced HIF2α in a STAT5-dependent manner 
(Extended Data Fig. 8a). Similarly, forced expression of constitutively 
active mutants of STAT5A or STAT5B also induced HIF2α expression 
(Extended Data Fig. 8b). In both cases, induction of HIF2α was associated 
with upregulated expression of its target gene, CITED2 (Extended 




[Figure 3 panels. a, mRNA fold amplification for CD36, OCT1, PML, SIRT1, ALOX5, STAT3, MDR1, GLUT1, β-catenin and HIF2α with imatinib or imatinib + Pio. b, mRNA fold amplification in purified CFSE-bright cells, including CITED2. c, CD34⁺ and CD34⁻ cell counts after control, HIF2α or STAT5 siRNA, with or without imatinib and Pio. d, cell counts after transduction with LvGFP or LvHIF2α.] 


Figure 3 | Expression of target genes in CP-CML cells exposed to 
pioglitazone and imatinib. a, RT-qPCR assays on CP-CML CD34⁺ cells after 
7 days without serum or cytokines. Data show means ± s.d., n = 11. b, RT-qPCR 
assays in purified 12-14 days CFSE-bright cells (that is, P and 1 division). 
Data show means ± s.d., n = 6. c, Absolute cell count together with CFSE 
analysis of representative CP-CML patient 8 (triplicate). Undivided cells (red 
dots). Also see Extended Data Fig. 7 and Extended Data Fig. 1c. d, Same as in 
c, but after transduction of cord blood CD34⁺ cells with lentivectors (Lv) 
(triplicate). Undivided cells (red dots). See statistics in Methods. 


Data Fig. 8a, b). Because CITED2 is a known master gene of HSC 
quiescence that regulates stemness-associated genes such as BMI1²², 
HES1* and p57 (also known as CDKN1C)”’, we studied the expression 
of these genes in CD34⁺ cells from CP-CML patients and in murine 
Ba/F3 cell lines we generated to express, by means of retroviral vector 
transduction and ubiquitous promoter driven expression, the consti- 
tutively active forms of murine Stat5a or Stat5b 1*6 (H299R, S711F). 
After 10 days of culture in the presence of imatinib, TKI-resistant 
CD34⁺ cells from CP-CML patients showed an increase in endogenous 
expression of both CITED2 itself (4.5-fold) and the known CITED2 
target genes BMI1 (2.8-fold), HES1 (3.1-fold) and p57 (16.5-fold). 
Addition of pioglitazone fully counteracted this increase in CITED2, 
BMI1 and HES1 expression and reduced the increase in p57 expression 
by fourfold (Extended Data Fig. 9a). Ba/F3 cell studies corroborated this 
evidence (Extended Data Fig. 9b and c). Taken together, these data lead 
us to propose that the CML LSC is critically dependent (NOA) on a 
PPARγ-STAT5-HIF2α-CITED2 pathway, directly and effectively inhibited by 
pioglitazone (Fig. 4a), thus extending the contention that equivalent 
murine leukaemias are addicted (NOA) to STAT5 (ref. 15). 

Because mouse models are poorly suited to investigate CML LSCs²⁴ 
and pioglitazone is an approved drug for the treatment of diabetes 
mellitus type 2 in humans, we initially sought to validate pioglitazone 
directly on two patients diagnosed with both diabetes and CML who 
never reached CMR in spite of long-term imatinib treatment. Before 
filing a formal clinical trial application, we prescribed pioglitazone 
off-label and under approved informed consent to a third CML patient, 
this time non-diabetic, who had likewise never reached CMR under 
long-term imatinib therapy (Fig. 4b). 

Pioglitazone was added to the treatment after 5, 6 and 4 years of 
uninterrupted imatinib therapy for patients 1, 2 and 3, respectively. 
None of the 3 patients ever reached CMR before introduction of 




[Figure 4 panels. a, schematic comparing the CML ‘bulk’ and the CML LSC under imatinib (I) and imatinib + pioglitazone (I + P), and showing PPARγ, HIF2α and CITED2 in relation to quiescence, self-renewal and differentiation. b, BCR-ABL/ABL levels over time (months, -30 to 60) for patients 1, 2 and 3 during imatinib and pioglitazone treatment, with the CMR4.5 level and a 6-month imatinib stop indicated.] 


Figure 4 | Pioglitazone induces complete and sustained molecular response 
(CMR) in CML patients. a, Model of CML LSC addiction to the 
PPARγ-STAT5-HIF2α pathway. (Top insert) For the bulk of dividing CML cells, 
imatinib (I) alone is able to bring phospho-STAT5 levels below a threshold 
(dotted line) at which apoptosis occurs (cross). For CML LSCs, only the 
combination of imatinib and pioglitazone (I + P) is able to bring cells below a 
threshold at which cells leave their state of quiescence before undergoing 
apoptosis. b, RT-qPCR assays for BCR-ABL/ABL on nucleated blood cells from 
the first three patients. See Supplementary Information for details. 


pioglitazone. Pioglitazone was added to the treatment of patient 1 
during two brief exposures of 10 and 8 months each with an interval 
of 28 months (Fig. 4b). CMR was achieved 10 months after initial 
pioglitazone addition, and patient 1 has remained in CMR for at least 
56 months, the last time-point collected for this study, which is 53 
months (4.5 years) after first stopping pioglitazone administration 
(Fig. 4b). For patient 2, CMR was obtained after 1 year of pioglitazone 
addition and maintained for 32 months at which time they withdrew 
(Fig. 4b). For patient 3, CMR was achieved after 6 months of pioglitazone 
addition. At this time point, the level of STAT5 mRNA in 
CD34⁺ cells from the bone marrow of patient 3 was decreased by 
11.9-fold. Patient 3 has remained in CMR for at least 38 months, the 
last time-point collected for this study, which is 28 months after stop- 
ping pioglitazone administration (Fig. 4b). Furthermore, patient 3 
decided to stop imatinib for the last 6 months of the aforementioned 
observation period and has remained in CMR during this period with- 
out any treatment (Fig. 4b and Supplementary Data). 

Regulatory approval was then obtained for multi-centre Phase II 
clinical trials, and the first (EudraCT 2009-011675-79) aimed at 
assessing the short-term cumulative incidence of CMR conversion 




for patients who never reached CMR under imatinib alone 
(https://www.clinicaltrialsregister.eu/ctr-search/trial/2009-011675-79/FR#E). 
Scoring by quantitative PCR was performed over the course of the first 
12 months after trial initiation during concurrent and brief exposure 
(3 to 12 months) to the imatinib-pioglitazone combination²⁵. Out of 
24 assessable patients, the cumulative incidence rate in the treated 
group reached 57% versus 27% (P = 0.02) for an historical group of 
patients having received imatinib alone, thus indicating that clinical 
evidence of efficacy can already be detected even after very brief 
treatment and early analysis. Post-trial follow-up confirmed stability 
of CMR status in data collected to date. Therapy with pioglitazone was 
accompanied by a stable reduction of STAT5 mRNA in patient sam- 
ples as early as month 6 (2.3-fold, P = 0.0003) and by a reduction of 
the clonogenic potential of bone marrow CD34* cells (1.54-fold, 
P = 0.0003). 

Although both imatinib and pioglitazone decrease STAT5 activity, 
they act by different mechanisms. Imatinib inhibits STAT5 activa- 
tion by BCR-ABL phosphorylation, whereas pioglitazone decreases 
STAT5 expression. It seems that imatinib alone is sufficient to induce 
effective clearance of the bulk of more differentiated CML cells, but 
fails to bring STAT5 activity below a threshold for CML LSC to exit 
from quiescence and to undergo subsequent apoptosis. Pioglitazone is 
effective at doing so in synergy with imatinib (Fig. 4a, top insert). As in 
this example with CML, progressive erosion of cancer stem cell pools 
may prove ultimately achievable pharmacologically, bringing hope of 
obtaining cancer eradication in a variety of human malignancies by 
combination therapy. 


Online Content Methods, along with any additional Extended Data display items 
and Source Data, are available in the online version of the paper; references unique 
to these sections appear only in the online paper. 


Received 17 March 2014; accepted 28 July 2015. 
Published online 2 September 2015. 


1. Nguyen, L. V., Vanner, R., Dirks, P. & Eaves, C. J. Cancer stem cells: an evolving 
concept. Nature Rev. Cancer 12, 133-143 (2012). 

2. Chomel, J.C. & Turhan, A. G. Chronic myeloid leukemia stem cells in the era of 
targeted therapies: resistance, persistence and long-term dormancy. Oncotarget 
2, 713-727 (2011). 

3. de Lavallade, H. et al. Imatinib for newly diagnosed patients with chronic myeloid 
leukemia: incidence of sustained responses in an intention-to-treat analysis. 

J. Clin. Oncol. 26, 3358-3363 (2008). 

4. Corbin, A. S. et al. Human chronic myeloid leukemia stem cells are insensitive to 
imatinib despite inhibition of BCR-ABL activity. J. Clin. Invest. 121, 396-409 (2011). 

5. Hu, C. J., Sataur, A. Wang, L., Chen, H. & Simon, M. C. The N-terminal 
transactivation domain confers target gene specificity of hypoxia-inducible factors 
HIF-1a and HIF-20. Mol. Biol. Cell 18, 4528-4542 (2007). 

6. Du,J.& Yang, Y. C. Cited2 in hematopoietic stem cell function. Curr. Opin. Hematol. 
20, 301-307 (2013). 

7. Graham, S. M. et al. Primitive, quiescent, Philadelphia-positive stem cells from 
patients with chronic myeloid leukemia are insensitive to ST|57 1 in vitro. Blood 99, 
319-325 (2002). 

8. Jgrgensen, H. G., Allan, E. K., Jordanides, N. E., Mountford, J. C. & Holyoake, T. L. 
Nilotinib exerts equipotent antiproliferative effects to imatinib and does not 
induce apoptosis in CD34* CML cells. Blood 109, 4016-4019 (2007). 


LETTER 


9. Copland, M. et al. Dasatinib (BMS-354825) targets an earlier progenitor 
population than imatinib in primary CML but does not eliminate the quiescent 
fraction. Blood 107, 4532-4539 (2006). 

10. Luo, J., Solimini, N. L. & Elledge, S. J. Principles of cancer therapy: oncogene and 
non-oncogene addiction. Cel! 136, 823-837 (2009). 

11. Prost, S. et a, Human and simian immunodeficiency viruses deregulate early 
hematopoiesis through a Nef/PPARy/STAT5 signaling pathway in macaques. 

J. Clin. Invest 118, 1765-1775 (2008). 

12. Berria, R. et al. Reduction in hematocrit and hemoglobin following pioglitazone 
treatment is not hemodilutional in Type Il diabetes mellitus. Clin. Pharmacol. Ther. 
82, 275-281 (2007). 

13. Avagyan, S., Aguilo, F., Kamezaki, K. & Snoeck, H. W. Quantitative trait mapping 
reveals a regulatory axis involving peroxisome proliferator-activated receptors, 
PRDM16, transforming growth factor-B2 and FLT3 in hematopoiesis. Blood 118, 
6078-6086 (2011). 

14. Wang, Z,, Li, G., Tse, W. & Bunting, K. D. Conditional deletion of STAT5 in adult 
mouse hematopoietic stem cells causes loss of quiescence and permits efficient 
nonablative stem cell replacement. Blood 113, 4856-4865 (2009). 

15. Hoelbl, A. et a/. Stat5 is indispensable for the maintenance of bcr/abl-positive 
leukaemia. EMBO Mol. Med. 2, 98-110 (2010). 

16. Kieslinger, M. et al. Antiapoptotic activity of Stat5 required during terminal stages 
of myeloid differentiation. Genes Dev. 14, 232-244 (2000). 

17. Fatrai,S., Wierenga, A. T., Daenen, S. M., Vellenga, E. & Schuringa, J. J. Identification 
of HIF2« as an important STAT5 target gene in human hematopoietic stem cells. 
Blood 117, 3320-3330 (2011). 

18. Matsumoto, A. et al. CIS, a cytokine inducible SH2 protein, is a target of the JAK- 

STAT5 pathway and modulates STAT5 activation. Blood 89, 3148-3154 (1997). 

19. Liu, S. etal. Targeting STAT5 in hematologic malignancies through inhibition of the 

bromodomain and extra-terminal (BET) bromodomain protein BRD2. Mol. Cancer 

Ther. 13, 1194-1205 (2014). 

20. Wang, L., Giannoudis, A., Austin, G. & Clark, R. E. Peroxisome proliferator-activated 
receptor activation increases imatinib uptake and killing of chronic myeloid 
leukemia cells. Exp. Hematol. 40, 811-819 (2012). 

21. Szanto, A. & Nagy, L. Retinoids potentiate peroxisome proliferator-activated 
receptor gamma action in differentiation, gene expression, and lipid metabolic 
processes in developing myeloid cells. Mol. Pharmacol. 67, 1935-1943 (2005). 

22. Chen, Y., Haviernik, P., Bunting, K. D. & Yang, Y. C. Cited2 is required for normal 
hematopoiesis in the murine fetal liver. Blood 110, 2889-2898 (2007). 

23. Du, J. et al. HIF-1α deletion partially rescues defects of hematopoietic stem cell 
quiescence caused by Cited2 deficiency. Blood 119, 2789-2798 (2012). 

24. Koschmieder, S. & Schemionek, M. Mouse models as tools to understand and 
study BCR-ABL1 diseases. Am. J. Blood Res. 1, 65-75 (2011). 

25. Rousselot, P. et al. Targeting STAT5 expression resulted in molecular response 
improvement in patients with chronic phase CML treated with imatinib. ASH 
Annual Meeting Abstracts, (2012). 


Supplementary Information is available in the online version of the paper. 


Acknowledgements We thank C. Costa, V. Tran Chau, F. Goullieux, A. Krief, P. Raynal, 
C. Terré, S. Tabore and T. Andrieu for their experimental contributions. This work was 
supported by the Association Laurette Fugain, Paris, France, by the Association pour la 
Recherche sur le Cancer, Villejuif, France to S.P., P.R. and P.L. and by the Chaire 
industrielle de l'Agence Nationale pour la Recherche (ANR) to P.L. 


Author Contributions S.P. led the project, designed and performed experiments, and 
analysed data. P.L. and S.Ch. designed experiments and analysed data. P.L. wrote the 
paper. F.R., M.S., Y.O., J.S., E.V., V.R., B.M. and G.M. contributed experimentally. P.R., 
J.-P.B., C.C. and S.Ca. contributed clinically. P.R., S.Ch. and P.L. have contributed equally 
to this work; P.R. in a clinical capacity, S.Ch. and P.L. in a scientific capacity. 


Author Information Reprints and permissions information is available at 
www.nature.com/reprints. The authors declare no competing financial interests. 
Readers are welcome to comment on the online version of the paper. Correspondence 
and requests for materials should be addressed to S.P. (stephane.prost@cea.fr) and 
P.L. (pleboulch@rics.bwh.harvard.edu). 


17 SEPTEMBER 2015 | VOL 525 | NATURE | 383 


©2015 Macmillan Publishers Limited. All rights reserved 


LETTER 


METHODS 


Reagents. For in vitro assays, PPARγ agonists were provided by Cayman 
Chemical (PPARγ-PAK; Bertin Pharma). Imatinib mesylate was provided by 
Novartis and was used at 1 μM in culture, a well-established inhibitory 
concentration in vitro that also approaches the achievable drug level in patients’ plasma. 
Dasatinib and JQ1 were provided by Bristol-Myers Squibb and Sigma and were 
used in culture at 0.146 μM and 1 μM, respectively. Murine pro-B cell line Ba/F3 
and human chronic myelogenous leukaemia cell line K562 were provided by the 
American Type Culture Collection (ATCC; Ref. CRL-12015 and CCL-243, 
respectively). These cell lines were tested for mycoplasma contamination every 
3 months using Venor Gem Advance Pre-aliquoted Mycoplasma Detection Kit 
(Minerva biolabs). 

Cell culture and proliferation assays. CD34+ cells from patients in CP-CML at 
diagnosis or umbilical cord blood were immunoselected (CD34 microBead Kit, 
Miltenyi Biotec) according to the manufacturer’s instructions. Enrichment for 
CD34+ cells was ascertained by flow cytometry using an anti-CD34 monoclonal 
antibody (clone 581; BD Pharmingen). Ph1+ CD34+ cells were cultured in 
serum-free medium (SFM) StemSpan (StemCell Technologies) without growth factors. 

Colony forming cell (CFC) and long term culture-initiating cell (LTC-IC) 
assays. For CFC assays, CD34+ cells were suspended (1 X 10*) in 3 ml of alpha- 
MEM based methylcellulose medium (GF H4434, StemCell Technologies). Cells 
were scored and collected after 14 days of incubation at 37 °C and 5% CO2. After 
scoring, colonies were washed with PBS and kept frozen in RNAlater (Invitrogen) 
for subsequent analysis. LTC-IC with limiting dilution assays (LDA) were 
performed in StemSpan SFM (StemCell Technologies) on irradiated MS5 monolayers 
at several dilutions of CD34+ cells (300, 150, 75, or 37 cells per well for Ph1+ 
CD34+ cells and 200, 100, 50, or 25 cells per well for CD34+ cells from healthy donors) 
in 96-well plates with 16 replicate wells per concentration. After five weeks with 
weekly change of one half of the medium volume, all cells were transferred into alpha-MEM 
based methylcellulose medium (GF H4434, StemCell Technologies) to determine the 
total clonogenic cell content of each LTC. LTC-IC frequencies were determined 
using the L-Calc software (StemCell Technologies). 
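
The frequency estimate behind a limiting dilution assay can be illustrated with a minimal sketch. It assumes the standard single-hit Poisson model, in which a well seeded with d cells is negative with probability exp(-f·d), and finds the maximum-likelihood frequency f by bisection. This is not the L-Calc implementation; the function name and the well counts in the usage note are hypothetical.

```python
import math


def ltc_ic_frequency(doses, wells, negatives):
    """Maximum-likelihood LTC-IC frequency under a single-hit Poisson model.

    doses     -- cells seeded per well at each dilution
    wells     -- replicate wells per dilution
    negatives -- wells without clonogenic output at each dilution
    """
    positives = [n - k for n, k in zip(wells, negatives)]
    if sum(positives) == 0 or sum(negatives) == 0:
        raise ValueError("need both positive and negative wells")

    def score(f):
        # Derivative of the log-likelihood with respect to f;
        # it decreases monotonically, so its root is the estimate.
        total = 0.0
        for d, pos, neg in zip(doses, positives, negatives):
            total += pos * d / math.expm1(f * d) - neg * d
        return total

    lo, hi = 1e-9, 1.0  # a frequency per cell cannot exceed 1
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if score(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0
```

For example, `1 / ltc_ic_frequency([300, 150, 75, 37], [16, 16, 16, 16], [4, 8, 11, 14])` gives the estimated number of cells per LTC-IC for made-up well counts at the dilutions used here for Ph1+ CD34+ cells.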

Flow cytometry. The following antibodies were used: fluorescein isothiocyanate 
(FITC)-conjugated IgG1 (clone 679.1Mc7, Beckman Coulter), Alexa Fluor 488- 
conjugated IgG1 (clone MOPC-21, BD Pharmingen), allophycocyanin (APC)- 
conjugated IgG1 (clone MOPC-21, BD Pharmingen), peridinin chlorophyll protein-cyanin 
5.5 (PerCP-Cy5.5)-conjugated IgG1 (clone X40, BD Pharmingen), phycoerythrin 
cyanin (PE-Cy7)-conjugated IgG1 (clone MOPC-21, BD Pharmingen), (PerCP- 
Cy5.5)-conjugated CD45 (clone 2D1, BD Pharmingen), (APC)-conjugated CD34 
(clone 581, BD Pharmingen), (PE-Cy7)-conjugated CD38 (clone HB7, BD 
Pharmingen), Alexa Fluor 488-conjugated anti-STAT5 (pY694) (clone 47, BD 
Pharmingen), (PE)-conjugated anti-GLUT1 (FAB1418P, R&D Systems). For all 
experiments, cell viability was assessed using SYTOX Blue dead cell stain 
(Invitrogen Life Technologies). 

Intracellular STAT5 phosphorylation assays. In brief, 3 X 10° K562 cells per ml 
cultured in complete Dulbecco’s modified Eagle medium supplemented with 10% 
fetal calf serum (PAA), with or without pioglitazone (10 μM) or imatinib 
(1 μM), at 37 °C in 5% CO2 were harvested at the indicated time points. Cells were 
fixed and permeabilized using Cytofix/Cytoperm kit (BD Pharmingen) and 
stained with Alexa Fluor 488-anti-phospho-STAT5 monoclonal antibody (BD 
Phosflow) or Alexa Fluor 488-isotype-matched control to obtain fluorescence 
minus comparative in each experiment. Analysis was carried out on a minimum 
of 50,000 events in the viable cell gate. The delta mean fluorescence intensity of 
p-STAT5 after drug treatment (p-STAT5 ΔMFI) was determined as follows: 
(untreated cells p-STAT5 MFI − untreated cells isotype-control MFI) − (drug- 
treated cells p-STAT5 MFI − drug-treated cells isotype-control MFI). 
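
As a worked illustration of this definition, the sketch below subtracts the isotype-control background from each condition before taking the difference. The function name and the MFI readings in the usage note are hypothetical and are not data from this study.

```python
def p_stat5_delta_mfi(untreated_stain, untreated_isotype,
                      treated_stain, treated_isotype):
    """Drop in background-corrected p-STAT5 signal after drug treatment."""
    # Specific signal = antibody MFI minus isotype-control MFI.
    untreated_specific = untreated_stain - untreated_isotype
    treated_specific = treated_stain - treated_isotype
    # A positive value means the drug lowered the p-STAT5 signal.
    return untreated_specific - treated_specific
```

With made-up readings, `p_stat5_delta_mfi(120.0, 20.0, 33.0, 20.0)` corrects the untreated signal to 100 and the treated signal to 13, so the ΔMFI is 87.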

CFSE assays. Fresh CD34+-enriched cells were stained with 2 μM of 5- (and 6-) 
carboxyfluorescein diacetate succinimidyl diester (CFSE, Invitrogen). Cells were 
then cultured (seeded 5.10° per ml) in SFM StemSpan (StemCell Technologies) 
without growth factors and with or without pioglitazone (10 μM) or imatinib 
(1 μM). Cells cultured in the presence of Colcemid (100 ng ml−1, Invitrogen 
Life Technologies) were used to establish the range of fluorescence exhibited by 
cells that had not divided during post-labelling incubation. Cells were harvested at 
the indicated time points, collected in BD Trucount tubes for absolute 
count (BD Biosciences) and labelled with anti-CD45 and anti-CD34. Then, cells 
were diluted in 1 ml of phosphate-buffered saline (PBS, Invitrogen Life 
Technologies) containing 2% fetal calf serum (PAA) and stained for viability. 
All analyses were carried out on a BD FACS Canto2 Flow Cytometer. 

DNA synthesis assay. Cell proliferation rate was measured by incorporation of 
5-ethynyl-2'-deoxyuridine (EdU), a thymidine nucleoside analogue, in DNA dur- 
ing active DNA synthesis (two hours). Staining was performed according to the 
manufacturer’s protocol (Click-iT EdU Flow Cytometry Assay Kit, Invitrogen). 
All analyses were carried out on a BD FACS Canto2 Flow Cytometer. 


RNA extraction and RT-qPCR analysis. RNA was extracted from 2 X 10° cells 
using RNAqueous-4PCR (Ambion). Reverse transcription was carried out for 1 h 
at 42 °C using SuperScript Vilo cDNA Synthesis kit (Invitrogen Life Technologies) 
according to the manufacturer’s instructions. Real-time PCR was performed in an 
iCycler thermocycler (CFX, Bio-Rad). The primer and probe sequences for 
GAPDH, STAT5B, STAT5A, BCR-ABL, ABL, HIF1α, CITED2, OCT1, MDR1, 
SIRT1, STAT3, ALOX5, GLUT1, β-catenin, PML, HIF2α, Bcl-XL, Bcl-2, 
PIM-1, CIS, BMI1, HES-1, p57 (CDKN1C) and CD36 (known to be upregulated 
by PPARγ agonists and used as a positive control) are reported in 
Supplementary Table 1. The primer pairs used with TaqMan Gene Expression 
Master mix (Applied Biosystems) and iQ Supermix SYBR GRN (Bio-Rad) are 
listed in Supplementary Table 1a and Supplementary Table 1b, respectively. The 
comparative Ct method (ΔΔCt) was used to compare gene expression levels 
between the different culture conditions (relative to GAPDH). 
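
The comparative Ct calculation can be sketched as follows. The sketch assumes that the amount of product doubles with every cycle (100% amplification efficiency), which is an assumption of this example and is not stated in the Methods; the function name and the Ct values in the usage note are hypothetical.

```python
def relative_expression(ct_target_treated, ct_reference_treated,
                        ct_target_control, ct_reference_control):
    """Fold change of a target gene by the comparative Ct (ddCt) method."""
    # Normalize each condition to the reference gene (GAPDH in the Methods).
    delta_treated = ct_target_treated - ct_reference_treated
    delta_control = ct_target_control - ct_reference_control
    # Compare the treated condition with the control condition.
    delta_delta_ct = delta_treated - delta_control
    # Assumes one doubling per cycle.
    return 2.0 ** (-delta_delta_ct)
```

For instance, `relative_expression(27.0, 18.0, 24.0, 18.0)` has a ΔΔCt of 3 cycles and returns 0.125, that is, an 8-fold decrease relative to the control condition.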

BCR-ABL/ABL quantification. qPCR experiments were performed on cDNA 
using the 7000 Sequence Detection System (Applied Biosystems). The BCR-ABL/ 
ABL ratio was determined using FusionQuant standards (Ipsogen) according to 
the Europe Against Cancer protocol³³. CMR is defined as undetectable minimal 
residual disease (negative BCR-ABL transcripts) while showing a sensitivity level 
of at least 40,000 amplified copies of the ABL control gene, that is to say more than 
4.5 log reduction by standardized International Scale (IS) RT-qPCR (that is, BCR- 
ABL/ABL* mRNA ratio < 0.0025%); relapse from CMR is to be declared when at 
least 2 consecutive positives occur 6 months apart. These criteria are consistent 
with the level of sensitivity routinely applied within laboratories participating in 
the French GBMHM Network (Group of Molecular Biologists for Hematological 
Malignancies). 
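
One way to read the link between the 40,000-copy requirement and the 0.0025% threshold is that a single detectable BCR-ABL transcript among 40,000 ABL copies corresponds to a ratio of 1/40,000. The sketch below works through that arithmetic. It assumes a detection limit of one copy and a baseline of 100% on the International Scale; both are assumptions of this example, as are the function names.

```python
import math


def detection_limit_percent(abl_copies, min_detectable_copies=1):
    """Smallest measurable BCR-ABL/ABL ratio, in percent."""
    return 100.0 * min_detectable_copies / abl_copies


def log_reduction(ratio_percent, baseline_percent=100.0):
    """Log10 reduction of a ratio relative to the baseline."""
    return math.log10(baseline_percent / ratio_percent)


limit = detection_limit_percent(40_000)  # 0.0025 (%)
depth = log_reduction(limit)             # about 4.6 logs, above 4.5
```

Under these assumptions, fewer amplified ABL copies would give a higher detection limit, and a negative result would no longer demonstrate a reduction of more than 4.5 logs.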

Interphase FISH probe assay. Fluorescent in situ hybridization (FISH) was per- 
formed on interphase nuclei, following standard procedures and using specific 
probe for the t(9;22) (MetaSystems, Germany). The probe is designed as a dual- 
fusion assay. The red labelled probe detects an extended region at the ABL1 locus 
on 9q34 and a green labelled probe hybridizes specifically to regions at the BCR 
gene on 22q11. Preparations were counterstained with 4,6-diamidino-phenyl- 
indole (DAPI) and a minimum of 50 interphase nuclei were examined. Results 
were recorded using a fluorescence microscope (Nikon) fitted with appropriate 
filters, and digital-imaging software Lucky (CaryoSystems, France). 

Western blot analysis. For STAT5 protein analysis, K562 cells (2.5 X 10°) were 
lysed in RIPA lysis buffer on ice. Whole-cell extracts were boiled for 5 min in 
Laemmli sample buffer and subjected to SDS-PAGE in 4-12% acrylamide gels 
(Nupage, Invitrogen Life Technologies). Proteins were transferred to Hybond N+ 
filters (Amersham). Membranes were probed with the following antibodies: 
STAT5 (sc-1656), ACTIN (sc-8432), PPARγ (H-100:sc-7196), goat anti-mouse 
IgG-HRP (sc-2005) (Santa Cruz Biotechnology Inc.) and anti-HIF2α (SMC- 
185C/D). Antibody binding was detected by the enhanced chemiluminescence 
ECL+ (Amersham). 

Lentiviral vector production and transduction. STAT5B lentiviral vector. The 
cDNA encoding STAT5B was cloned, sequenced (GenBank accession number 
DQ267926), and inserted into the SIN-cPPT-PGK-WHV lentiviral transfer vector 
as previously described**. A SIN-cPPT-PGK-eGFP-WHV lentiviral vector was 
used for control. 

OCT-1 and HIF2α lentiviral vectors. OCT-1 (SLC22A1, accession number 
BC126364) and HIF2α (EPAS1, accession number BC051338) lentiviral vectors 
were provided by Applied Biological Materials, Inc. (catalogue nos LV309003 and 
LV149063). 

Constitutively active murine Stat5a and Stat5b. Stat5a(1*6)(H299R, S711F) and 
Stat5b(1*6)(H299R, S711F) retroviral vectors were provided by Cell Biolabs, Inc. 
(catalogue nos RTV-333 and RTV-335). 

shRNA lentiviral vector anti-PPAR-γ. The PPAR-γ mRNA pairing sequence 5’- 
TGTTCCGTGACAATCTGTC-3’ (GenBank accession number L40904) was 
designed and synthesized as follows within an shRNA structure comprising 
unique restriction sites at each end: sense 5'-GATCTCCTGTTCCGTGACA 
ATCTGTCTTCAAGAGAACAGATTGTCACGGAACATTTTTGGAAGAATT 
CC-3’; antisense 5'-CTGAGGAATTCTTCCAAAAATGTTCCGTGACAAT 
CTGTAAGTTCTCTACAGATTGTCACGGAACAGGA-3’. Oligonucleotides 
were annealed and ligated into BglII and XhoI sites of linearized pSuper plasmid. 
PolIII H1 promoter-shRNA PPAR-γ was then subcloned in the pTRIP lentiviral 
vector. Vectors were produced as previously described**. 

BCR-ABL lentiviral vector. Total RNA from K562 cells was extracted using 
TRIzol (Invitrogen Life Technologies). Reverse transcription was carried out for 
1 h at 50 °C using SuperScript III (Invitrogen Life Technologies). Two independent 
PCR were performed using BCR-ABL F 1, 5’- ATGGTGGACCCGGTGGGCTT-3’ 
with BCR-ABL R 2831, 5'-CTGCTACCTCTGCACTATGTCACTG-3’ and BCR- 
ABL F 2685, 5'-TCCGCTGACCATCAATAAGGA-3' with BCR-ABL R 6097, 5'- 
CTGCTACCTCTGCACTATGTCACTG-3’, respectively. Specific amplification 
bands were pooled, heated to 95 °C for 3 min and ramp-cooled to 25 °C over a 
period of 45 min. Annealing product was submitted to a third PCR with LA Taq 
DNA polymerase (Takara) using the following primer pair: BCR-ABL F 1 AscI: 5’- 
AGGCGCGCCATGGTGGACCCGGTGGGCTT-3’ and BCR-ABL R 6097 SbfI: 
5'-CCTGCAGGCTGCTACCTCTGCACTATGTCACTG-3’. The amplification 
product was subcloned into a pCR-XL-TOPO plasmid (Invitrogen Life Technologies) 
before being inserted into the SIV GAE-SSFV lentiviral transfer vector, followed by 
DNA sequencing. An SIV GAE-SSFV-eGFP vector was used as a control. The SIV 
vectors were produced as previously described³⁴. 

CD34+ cell transduction. Cells were suspended (1 X 10° ml−1) in StemSpan 
(StemCell Technologies, France) supplemented with protamine sulphate 
(4 μg ml−1), SCF (100 ng ml−1), FLT-3-L (100 ng ml−1), IL-3 (20 ng ml−1), and 
IL-6 (20 ng ml−1), in a 96-well plate coated with RetroNectin (Takara Shuzo Co., 
Japan). Cell suspensions were incubated for 16 h. Lentiviral vectors were then 
added and cell suspensions incubated for 12 h. Cells were washed twice before 
being seeded. 

siRNA assays. siRNAs targeting the human PPARγ sequence 5'-TGTTCCGTG 
ACAATCTGTC-3’ were synthesized (Sigma-Aldrich Proligo). siRNAs targeting 
the human STAT5 sequence 5’-AAACTCAGGGACCACTTGC-3’, human HIF2α 
sequence 5'-ATTAGAGCAAAGAGTCAGC-3’ and murine Cited2 sequence 5’- 
CGAGGAAGTGCTTATGTCCTT-3’ were synthesized (Eurofins MWG Operon). 
CD34+ BM cells were transfected with specific siRNA (25 nM) or control siRNA in 
the presence of Lipofectamine 2000 (Invitrogen) and maintained for 48 h before 
CFC assay. Control siRNA was purchased from Invitrogen Life Technologies 
(BLOCK-iT). Transfection efficiency was assessed using a fluorescein-labelled, 
double-strand RNA duplex (BLOCK-iT FluorescentOligo; Invitrogen). 

Human patients. Fresh bone marrow from patients with chronic-phase CML at 
diagnosis, umbilical cord blood cells from healthy donors, and blood or bone 
marrow samples from diabetes patients and patients given pioglitazone off-label 
were obtained with informed consent approved by the hospital’s Institutional 
Review Board (Comité de protection des personnes Ile-de-France XI) under 
approved protocol EudraCT number: 2009-011675-79. Imatinib levels were 
measured in patients’ plasma using a previously described high-performance liquid 
chromatography (HPLC) method³⁵. 

Statistical analysis. No statistical methods were used to predetermine sample size, 
the experiments were not randomized and the investigators were not blinded to 
allocation during experiments and outcome assessment. For culture assays and 
quantitative real-time PCR, values were calculated as mean ± standard deviation 
for at least three separate experiments performed in triplicate. Paired and unpaired 
comparisons were made, using the nonparametric Wilcoxon rank test and the 
Mann-Whitney test, respectively. Limiting dilution analysis was carried out with 
L-Calc software (StemCell Technologies). All statistical analyses were carried out 
with StatView software (SAS Institute Inc., Cary, NC). 

Statistical information on samples described in figures. For all culture assays, 
paired and unpaired comparisons were made using the nonparametric Wilcoxon 
rank test and the Mann-Whitney test, respectively. Fig. 1a, b, plotted are means for 
CD34+ cells from 4 CP-CML patients, 16 replicates for each. Data with imatinib 
alone are not statistically different from those for the untreated control 
(P = 0.067), whereas pioglitazone as a single agent reduced LTC-IC frequencies 
by 2.4-fold (P = 0.008) or by 3.5-fold in combination with imatinib (P < 0.001). 
LTC-IC frequencies were established using L-Calc software (StemCell Techno- 
logies). Fig. 1c-e, All patients and statistical analysis are presented in Extended 
Data Table 1 (n = 6). Fig. 2a, STAT5B RT-qPCR normalized to GAPDH mRNA. 
Shown are means with standard deviations (s.d.) for 5 CP-CML patients. STAT5B 
mRNA levels decreased by 8.5-fold (P < 0.0001), 1.5-fold (P = 0.08) and 10.5-fold 
(P <0.0001) in the presence of pioglitazone, imatinib and the drug combination, 
respectively. Fig. 2b, Compared to imatinib alone, RT-qPCR analysis shows that 
addition of pioglitazone induces a significant reduction in mRNA levels by 3.3 and 
4.8 fold for BCL-xL and BCL-2, respectively, and 1.6 fold for PIM1 and CIS. mRNA 
quantifications are normalized to GAPDH levels (n = 5 CP-CML patients, 
*P < 0.05). Fig. 2c, Means with s.d. for 5 CP-CML patients. The effect of 
pioglitazone was negated by an siRNA against PPARγ (P = 0.043); siRNA validated in 
Extended Data Fig. 1c. Fig. 2d, Means with s.d. for 5 CP-CML patients. STAT5 
overexpression counteracts the effect of pioglitazone (P = 0.0047); LvSTAT5 validated in 
Extended Data Fig. 1c. Fig. 3a, results are normalized to GAPDH mRNA levels and 
represented relative to mRNA expression for the “Imatinib alone” condition 
(means of 11 patients with s.d. for each mRNA assessed). Fig. 3b, results are 
normalized to GAPDH mRNA levels and represented relative to mRNA 
expression of “untreated cells” (means of 6 patients with s.d.). As compared to untreated 
controls, cells treated with imatinib alone show a 6.2-fold increase of HIF2α 
relative to control (P < 0.011) and a 4.5-fold increase of CITED2 (P = 0.0277). 
The addition of pioglitazone reduced the imatinib-mediated HIF2α increase 
to 2.8-fold (P = 0.027). Pioglitazone significantly reduced HIF2α induction by 
2.2-fold (P = 0.027) and fully counteracted CITED2 induction. Pioglitazone alone 
has no effect compared to control. Fig. 4b, According to IS, relapse was declared after 2 
consecutive positives, 6 months apart. For Extended Data Figures, statistical 
information is included in their legends. 

Statistical information regarding synergy determination. The putative 
synergistic effect of multiple drugs was determined by the algorithm and definitions 
of the Chou-Talalay median-effect method³⁶. The imatinib concentration 
required for 50% inhibition of the number of colonies obtained after CFC assay 
(IC50) was first determined in CD34+ cells from 5 CP-CML patients at 
diagnosis. There was marked variability between patients (0.6 μM < IC50 < 2 μM with 
a median at 1 μM). Accordingly, at 1 μM imatinib concentration, the percentage 
inhibition was 47.1% ± 24 of untreated CFC numbers in our full cohort of 29 CP- 
CML patients (Extended Data Figure 1). Percentages of inhibition were 65.7% ± 26 
and 12% ± 15 of untreated CFC numbers for pioglitazone alone (10 μM) and 
combination (imatinib 1 μM, pioglitazone 10 μM), respectively. A combination 
index (CI) < 1 defined synergy. We calculated IC50 and CI values with the 
CalcuSyn software (Biosoft, Cambridge, UK). Imatinib and pioglitazone were 
assumed to have independent modes of action. Under these conditions, CI values were 
always less than 0.248, thus indicating synergy between the two drugs. 
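
The combination index can be illustrated with a minimal sketch of the Chou-Talalay calculation. It assumes that each drug follows the median-effect equation fa/(1 − fa) = (D/Dm)^m and that independent (mutually non-exclusive) modes of action add the product term to the index. It is not the CalcuSyn implementation, and the function names and all parameter values in the usage note are hypothetical.

```python
def dose_for_effect(fa, dm, m):
    """Dose of a single drug giving fraction affected fa (median-effect equation).

    dm -- median-effect dose (the IC50 when fa is an inhibited fraction)
    m  -- slope of the median-effect plot
    """
    return dm * (fa / (1.0 - fa)) ** (1.0 / m)


def combination_index(d1, d2, fa, dm1, m1, dm2, m2, independent=True):
    """Combination index for doses d1 and d2 that together give effect fa."""
    dx1 = dose_for_effect(fa, dm1, m1)  # dose of drug 1 alone for the same effect
    dx2 = dose_for_effect(fa, dm2, m2)  # dose of drug 2 alone for the same effect
    ci = d1 / dx1 + d2 / dx2
    if independent:
        # Extra term used when the drugs are assumed to act independently.
        ci += (d1 * d2) / (dx1 * dx2)
    return ci
```

As a made-up example, `combination_index(0.25, 1.25, 0.5, 1.0, 1.0, 5.0, 1.0)` returns 0.5625: each drug is given at a quarter of its own median-effect dose, yet the combination reaches 50% effect, so the index is below 1 and the pair would be scored as synergistic.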


26. Davies, G. F., Juurlink, B. H. & Harkness, T. A. Troglitazone reverses the multiple 
drug resistance phenotype in cancer cells. Drug Des. Devel. Ther. 3, 79-88 (2009). 

27. Yuan, H. et al. Activation of stress response gene SIRT1 by BCR-ABL promotes 
leukemogenesis. Blood 119, 1904-1914 (2012). 

28. Pitulis, N., Papageorgiou, E., Tenta, R., Lembessis, P. & Koutsilieris, M. IL-6 and 
PPARy signalling in human PC-3 prostate cancer cells. Anticancer Res. 29, 
2331-2337 (2009). 

29. Chen, Y., Hu, Y., Zhang, H., Peng, C. & Li, S. Loss of the Alox5 gene impairs leukemia 
stem cells and prevents chronic myeloid leukemia. Nature Genet. 41, 783-792 
(2009). 

30. Kominsky, D. J. et al. Abnormalities in glucose uptake and metabolism in imatinib- 
resistant human BCR-ABL-positive cells. Clin. Canc. Res. 15, 3442-3450 (2009). 

31. Lu, D. & Carson, D. A. Repression of beta-catenin signaling by PPAR gamma 
ligands. Eur. J. Pharmacol. 636, 198-202 (2010). 

32. Ito, K. et al. A PML-PPAR-δ pathway for fatty acid oxidation regulates 
hematopoietic stem cell maintenance. Nature Med. 18, 1350-1358 (2012). 

33. Gabert, J. et al. Standardization and quality control studies of ’real-time’ 
quantitative reverse transcriptase polymerase chain reaction of fusion gene 
transcripts for residual disease detection in leukemia — a Europe Against Cancer 
program. Leukemia 17, 2318-2357 (2003). 

34. Nègre, D. et al. Characterization of novel safe lentiviral vectors derived from simian 
immunodeficiency virus (SIVmac251) that efficiently transduce mature human 
dendritic cells. Gene Ther. 7, 1613-1623 (2000). 

35. Roth, O. et al. Imatinib assay by HPLC with photodiode-array UV detection in 
plasma from patients with chronic myeloid leukemia: comparison with LC-MS/ 
MS. Clin. Chim. Acta 411, 140-146 (2010). 

36. Chou, T. C. Drug combination studies and their synergy quantification using the 
Chou-Talalay method. Cancer Res. 70, 440-446 (2010). 




[Extended Data Figure 1 image, panels a-c. a, colony numbers after treatment with PGJ2, Tro, Cig, Ros, Pio or MCC. b, PPARγ transcript levels with si-PPARγ or sh-PPARγ. c, western blots with PPARγ, pan-STAT5, HIF2α and β-actin antibodies, with quantification for si-PPARγ, LvSTAT5B, si-STAT5 and si-HIF2α.] 


Extended Data Figure 1 | Clonogenicity assays in the presence of various 
PPARγ agonists and validation of STAT5B overexpression and anti- 
PPARγ, anti-STAT5 and anti-HIF2α siRNA. a, Clonogenic capacities of BM 
CD34+ cells were assayed following pre-incubation for 2 days with culture 
medium alone (control) or supplemented with PPARγ agonists, PGJ2, 
troglitazone (Tro), ciglitazone (Cig), rosiglitazone (Ros), pioglitazone (Pio) or 
MCC-555 (MCC) (25 μM each) (samples from 4 donors in triplicate). The 
number of colonies scored is expressed as percentage of control (untreated) 
values with standard deviation (s.d.), *P < 0.05 using the nonparametric 
Wilcoxon rank test. b, Validation of anti-PPARγ siRNA used in Fig. 2b. CD34+ 
cells were transfected with irrelevant or PPARγ targeting siRNA (25 nM each). 
An anti-PPARγ shRNA was used as a positive control. PPARγ transcripts 
were normalized to GAPDH transcripts and expressed relative to the 
levels measured in untransfected cells. c, Western blot analysis with PPARγ, 
pan-STAT5, HIF2α and anti-actin antibodies (Ab). Validation of siRNA 
against PPARγ or STAT5 and of the lentivector expressing STAT5B (LvSTAT5B) 
was performed on CD34+ cells from human UCB. Validation of siRNA against 
HIF2α was performed on the K562 cell line. Ctrl, scrambled siRNA; -, untreated. 
Quantification of western blot signal was performed with ImageJ software 
(http://rsb.info.nih.gov/ij/). Histograms show mean values with s.d., n = 3. 




[Extended Data Figure 2 image, panels a-f. a, b, CFC assays with or without imatinib and pioglitazone (Pio), in b with or without Lv BCR-ABL; P < 0.0001 is marked in both panels. c, d, LTC-IC (LDA) plotted against the number of Ph+CD34+ cells seeded per well. e, f, CFSE analysis at D7 in culture. Values that remain legible in the embedded tables are listed below.] 

c, LTC-IC frequencies (CI, 95% confidence interval) 
Untreated: 1/273 (1/221 to 1/337) 
Dasatinib: 1/385 (1/306 to 1/484) 
Pioglitazone: frequency illegible (1/579 to 1/1045) 
Pioglitazone + dasatinib: 1/1659 (1/1101 to 1/2498) 

d, LTC-IC frequencies (CI, 95% confidence interval) 
Untreated: 1/337 (1/264 to 1/430) 
Imatinib: 1/369 (1/294 to 1/463) 
Rosiglitazone: frequency illegible (1/537 to 1/950) 
Rosiglitazone + imatinib: 1/1390 (1/957 to 1/2044) 

e, CD34+ and undivided (P) cell counts; patient labels and some rows are illegible 
First group: imat + pio, 879 and 112; dasa + pio, 546 and 33 
Second group: imat, 560 and 160; imat + pio, 80 and 0; dasa, 800 and 160; dasa + pio, 240 and 8 

f, Absolute number of CD34+ CP-CML cells that never divided (P) or had divided only once 
             Untreated   Imat   Pio   Imat+Pio   JQ1   Imat+JQ1 
Patient 1    570         420    270   190        300   180 
Patient 2    670         480    200   110        220   120 

Extended Data Figure 2 | Differential and synergistic effects of pioglitazone 
and TKIs on CML cells. a, CFC assays with CD34+ CP-CML cells from 
patients at diagnosis. Imatinib and/or pioglitazone were added for 48 h before 
CFC assays. Means of 29 patients with standard deviation (s.d.). b, CFC 
assays after lentivector-mediated expression of BCR-ABL or eGFP (negative 
control) in human cord blood CD34+ cells. Imatinib and/or pioglitazone 
were added for 48 h before CFC assays. Means of 3 individuals in triplicate 
with s.d. c, d, Limiting dilution analysis (LDA) of CML LSCs by LTC-IC and 
frequency analysis. Plotted are means for CD34+ cells from 2 CP-CML 
patients, 16 replicates each. Imatinib 1 μM, Rosi 10 μM. e, f, CFSE analysis of 
CD34+ cells (>96% Ph+) from CP-CML patients (for all experiments, 
imatinib 1 μM, dasatinib 0.146 μM, pioglitazone and rosiglitazone 10 μM, JQ1 
1 μM. imat, imatinib; dasa, dasatinib; pio, pioglitazone; (P), undivided). To 
confirm the pivotal role played by STAT5 in the mechanism of action of 
pioglitazone in eroding the pool of TKI-resistant CML-LSCs, we investigated 
here the effect of the bromodomain inhibitor JQ1, which inhibits the 
transcriptional function of STAT5 by decreasing its activity through targeting 
the bromodomain-containing protein 2 (BRD2), a key cofactor of STAT5. 
Although this study with JQ1 is corroborative, one cannot completely exclude 
the possibility that these effects are coincidental, as targeting BRDs may cause a 
series of effects independent of STAT5. 




[Extended Data Figure 3 image, panels a-d. a, time course (D5, D10, D15) for imatinib, Pio and imatinib + Pio. b, bar chart for CD38 subsets with imatinib, Pio and imatinib + Pio. c, flow cytometry histograms of STAT5 (Tyr694) staining with p-STAT5 ΔMFI values for 1: imatinib 15 min, 2: imatinib 30 min, 3: Pio 72 h, 4: Pio 96 h, and control. d, western blot, lanes 1-4, with STAT5 and β-actin antibodies and STAT5/β-actin ratios.] 


Extended Data Figure 3 | Pioglitazone slowly decreases STAT5 expression 
whereas imatinib rapidly inhibits STAT5 phosphorylation. a, Differential 
kinetics of action of imatinib and pioglitazone. CD34+ CP-CML cells 
(patient 4) in liquid culture in serum-free medium without cytokines. b, Rate of 
apoptosis in CP-CML cell populations after 4 days of culture with imatinib 
and/or pioglitazone (n = 5; *P < 0.05). Solid bars (black for CD38” and grey 
for CD38"), percentage of recovery relative to input and normalized to 
untreated controls. Hatched bars, percentage of apoptosis, defined by the 
expression of annexin V. c, Flow cytometry analysis of permeabilized K562 cells 
with IgG against phosphorylated (Tyr694) STAT5. Untreated (black) and drug 
treated (red or blue). Control panel, no drug treatment but irrelevant IgG 
isotype control (grey peak). d, Western blot analysis with pan-STAT5 and 
anti-actin antibodies, showing a decrease of STAT5 by 3.5-fold ± 0.5 (s.d.) in 
lane 4 (n = 3). Lanes 1 and 2 for imatinib (15 and 30 min exposure, 
respectively); lanes 3 and 4 for Pio (72 and 96 h exposure, respectively). 
Ratio indicates ratio of STAT5 expression/β-actin expression relative to lane 1. 
Quantification of western blot signals (n = 3 for each condition) was 
performed with ImageJ software (http://rsb.info.nih.gov/ij/). 




[Extended Data Figure 4 image, panels a-e. a, CFSE histograms of CD34+ cells at D7 and D10 of culture for Pio + LvGFP and Pio + LvSTAT5; x axis, number of mitosis (CFSE). b, table of CD34+ cell distribution (%) per division for Pio + LvGFP and Pio + LvSTAT5. c, d, bar chart and histogram for Pio + LvSTAT5, Pio + LvGFP and GFP. e, table for patients 2, 8 and 9 giving CD34 counts and undivided (P) cells for LvGFP D10, LvGFP + Pio D10 and LvSTAT5 + Pio D10.] 

Extended Data Figure 4 | Forced expression of STAT5 in CP-CML CD34+ 
cells increases the compartment of quiescent cells. a, CFSE analysis of 
CP-CML CD34+ cells treated with pioglitazone after transduction with 
lentivectors (Lv) expressing eGFP or STAT5B, whose transcription is PPARγ- 
independent. Representative CP-CML patient 2 in triplicate (data for all 
patients are in Extended Data Fig. 2e). One coloured peak for each cell division 
number. P, colcemid arrested “parent-cells”. b, Distribution (%) of CD34+ cells 
in each division peak shown in Extended Data Fig. 3a. c, STAT5 mRNA 
expression analysis. d, Transduction efficiency of STAT5 lentivector (5 replicates 
with s.d.). e, Data for the 3 patients tested. 




[Extended Data Figure 5 image. LTC-IC (LDA) plot against the number of CD34+ cells per well (50 to 300) for CD34+, CD34+ + Pio, Ph+-CD34+ and Ph+-CD34+ + Pio cells; a 65% reduction in LTC-IC frequencies is marked.] 


Extended Data Figure 5 | High toxicity of pioglitazone for CML LSCs vs. 
low toxicity for normal HSCs. LTC-IC (LDA) showing differential 
toxicity of pioglitazone for CP-CML vs. normal CD34+ cells (n = 3, 16 
replicates for each). 




[Extended Data Figure 6 image, panels a-c. a, LvOCT1 transduction efficiency. b, OCT1 mRNA expression with imatinib, with or without LvOCT1. c, cell counts at D5 and D8 with or without imatinib and LvOCT1.] 


Extended Data Figure 6 | Erosion of undivided and imatinib-resistant 
CD34+ CP-CML cells is OCT1-independent. a, Efficiency of LvOCT1 
transduction (D8). b, OCT1 mRNA expression. Results are normalized to 
GAPDH mRNA levels and represented relative to mRNA expression for the 
“imatinib alone” condition. c, CFSE analysis and absolute cell count in the 
presence of imatinib, with or without OCT1 overexpression. Left scale (black), 
total cells showing CD34+ vs. CD34− cells (histograms). Right scale (red), 
undivided CD34+ cells (red dots) (representative for n = 3 CP-CML patients). 




[Figure panels a–i: a, f, CFSE histograms (number of mitoses) for untreated and imatinib-treated cells and for LvGFP and LvHIF2α; b, g, tables of CD34+ cell distribution (%) across division peaks (9 to 1 and P) for siRNA control, siRNA STAT5, siRNA HIF2α, Pio, imatinib combinations, LvGFP and LvHIF2α; c, d, h, i, mRNA expression and transduction efficiency charts; e, table of CD34+ and undivided (P) cell counts at D7 for patients 8, 10, 11, 12 and 13 under each siRNA and drug condition.]


Extended Data Figure 7 | The viability of undivided (P) and imatinib-
resistant CD34+ CP-CML cells depends on HIF2α expression. Representative
CP-CML patient 8 in triplicate (data for all patients are in Extended Data
Fig. 6e). a, CFSE analysis in presence of siRNA against STAT5 or HIF2α in
CD34+ Ph+ cells treated or not with imatinib. One coloured peak for each cell
division number. P, colcemid-arrested ‘parent cells’. b, Distribution (%) of
CD34+ cells in each division peak. c, HIF2α mRNA expression 72 h after
siHIF2α transfection. d, STAT5A and STAT5B mRNA expression 72 h after siSTAT5
transfection into human UCB CD34+ cells. Results are normalized to
GAPDH mRNA levels (means of 5 experiments with s.d. for each gene
assessed). e, Data for the 5 patients tested (*P < 0.05 relative to siCtrl; #P < 0.05
relative to Imatinib + siCtrl). f, CFSE analysis of cord blood CD34+ cells
after transduction with lentivectors (Lv) expressing HIF2α or eGFP.
One coloured peak for each cell division number. P, colcemid-arrested ‘parent
cells’. g, Distribution (%) of CD34+ cells in each division peak (n = 5).
h, Transduction efficiency of HIF2α Lv. i, HIF2α mRNA expression (means
of 5 experiments with s.d.).




[Figure panels a, b: mRNA fold amplification (relative to control). a, STAT5, HIF-1α, HIF-2α and CITED2 in CD34+ cells with or without LvBCR-ABL, imatinib, Pio, siRNA Ctrl, siRNA HIF2α and siRNA STAT5; b, Stat5, Hif-1α, Hif-2α and Cited2 in Ba/F3, Ba/F3 BCR-ABL, Ba/F3 A* and Ba/F3 B* cells.]


Extended Data Figure 8 | Expression of target genes in CD34+ cells and
Ba/F3 cell line CML models. a, mRNA expression of target genes in CD34+
cells from UCB transduced or not by BCR-ABL-expressing lentivector (Lv).
BCR-ABL+ cells were cultured in serum-free medium without cytokines for 7
days with either imatinib alone (1 µM) or imatinib and pioglitazone (1 µM and
10 µM, respectively) (means of 5 experiments with s.d. for each gene assessed).
Results are normalized to GAPDH mRNA levels and represented relative to
mRNA expression for the ‘untreated’ condition. Overexpression of BCR-ABL
in CD34+ cells from umbilical cord blood induced expression of STAT5 and
HIF1α mRNAs by 2.7- and 2.8-fold, respectively (P = 0.043), while HIF2α
and CITED2 mRNAs were increased by 12.5-fold and 9-fold (P = 0.043),
respectively. In the presence of imatinib, STAT5 and HIF1α mRNAs
were increased by 7.5- and 4.9-fold (P = 0.043), respectively, while HIF2α and
CITED2 were increased by 19.3- and 22-fold (P = 0.043), respectively. Either
pioglitazone or an siRNA against STAT5 (A and B) significantly reduced
the levels of HIF2α and CITED2 mRNAs, while an siRNA against HIF2α
significantly reduced CITED2 mRNA expression (>threefold each, P < 0.05).
b, mRNA expression of target genes in Ba/F3 cell sub-lines independent of IL3
for viability after transduction with LvBCR-ABL or constitutively activated
Stat5A1*6 (A*) or Stat5B1*6 (B*). Results are normalized to GAPDH mRNA
levels and represented relative to mRNA expression for the original Ba/F3
cell line (means of 5 experiments with s.d. for each gene assessed). Forced
expression of BCR-ABL increased the level of murine endogenous Stat5 (a and
b) mRNAs by 2.7-fold (P = 0.043). When BCR-ABL or constitutively activated
murine Stat5 1*6 (a or b) were overexpressed, murine endogenous Hif1α
mRNA level was decreased by threefold (P = 0.043) and murine endogenous
Hif2α and Cited2 mRNAs increased by more than eightfold each (P = 0.043).
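
The caption states that mRNA levels are normalized to GAPDH and expressed relative to a control condition, but it does not say how the fold values were computed. A minimal sketch, assuming the common 2^(-ΔΔCt) calculation for qRT-PCR data; the function name and the Ct values are hypothetical and are not taken from the paper:

```python
def relative_expression(ct_target, ct_ref, ct_target_ctrl, ct_ref_ctrl):
    """Fold change of a target gene versus a control condition.

    Assumes the 2^(-ddCt) method (an assumption, not confirmed by the source):
    each Ct is first normalized to the reference gene (here GAPDH), then the
    treated sample is compared with the control sample.
    """
    d_ct_sample = ct_target - ct_ref              # normalize treated sample
    d_ct_control = ct_target_ctrl - ct_ref_ctrl   # normalize control sample
    dd_ct = d_ct_sample - d_ct_control
    return 2 ** (-dd_ct)


# Hypothetical Ct values: the target crosses threshold 3 cycles earlier
# in the treated sample, with an unchanged reference gene.
fold = relative_expression(ct_target=22.0, ct_ref=18.0,
                           ct_target_ctrl=25.0, ct_ref_ctrl=18.0)
print(f"fold change relative to control: {fold:.1f}")
```

Under this assumption a difference of three cycles corresponds to an eightfold change, which gives a sense of scale for the 2.7- to 22-fold values quoted in the caption.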




[Figure panels a–c: a, mRNA fold amplification (relative to control) for untreated, imatinib, Pio and imatinib + Pio conditions; b, mRNA fold amplification in Ba/F3, Ba/F3 BCR-ABL and Ba/F3 A* cells; c, EdU (%) in Ba/F3, Ba/F3 A* and Ba/F3 B* cells with irrelevant siRNA or si-Cited2.]


Extended Data Figure 9 | The key regulator of HSC quiescence, CITED2, is
overexpressed in TKI-resistant CD34+ cells from CP-CML patients.
a, mRNA expression of CITED2 and target genes thereof BMI1, HES1 and p57
after 9 days of culture with or without imatinib and pioglitazone. Results are
normalized to GAPDH (n = 4). b, mRNA expression of endogenous
murine Cited2 and its target genes Bmi1, Hes1 and p57 in Ba/F3 cell line with
or without forced expression of BCR-ABL or constitutively active Stat5A 1*6
(A*) in the presence or not of siRNA against Cited2. Results are normalized to
GAPDH (mean ± s.d. of 3 independent experiments in triplicate). Forced
expression of a constitutively active form of murine Stat5 1*6 (A or B) in Ba/F3
cells, in and of itself, was sufficient to increase endogenous expression of murine
Cited2 markedly (52-fold) as well as that of its target genes Bmi1 (2.5-fold),
Hes1 (13-fold) and p57 (18-fold). c, Proliferation analysis by EdU incorporation
assay of the Ba/F3 cell line that expresses or not constitutively active forms
of Stat5 A 1*6 (A*) or B 1*6 (B*) in the presence or not of siRNA against Cited2
(representative result of 5 independent experiments).




Extended Data Table 1 | CFSE analysis of CD34+ cells (>96% Ph+) from 6 CP-CML patients after liquid culture without cytokines

          Untreated D10        Imatinib D10         Pio D10              Imatinib + Pio D10
Patient   CD34     Undivided   CD34     Undivided   CD34     Undivided   CD34     Undivided
          (×10³)   (P)         (×10³)   (P)         (×10³)   (P)         (×10³)   (P)
2         13       310         40       410         6.0      60          0.8      45
3         3.6      44          0.6      60          28       22          0.5      15
4         15       129         3.6      150         6.0      69          23       27
5         3.0      99          1.0      132         0.7      60          0.5      81
6         15       128         24       63          9.0      35          13       48
4         7.8      220         1.3      251         18       153         0.8      107

mean               155                  177.6                66.5                 53.8

P = 0.248   P = 0.027
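
As a quick consistency check on the table, the "mean" row can be recomputed from the undivided (P) counts of the six patients. The sketch below copies those counts from the table; the dictionary layout and the helper name are illustrative only:

```python
# Undivided (P) cell counts per patient, copied from Extended Data Table 1.
undivided = {
    "Untreated D10": [310, 44, 129, 99, 128, 220],
    "Imatinib D10": [410, 60, 150, 132, 63, 251],
    "Pio D10": [60, 22, 69, 60, 35, 153],
    "Imatinib + Pio D10": [45, 15, 27, 81, 48, 107],
}


def column_means(columns):
    """Arithmetic mean of each column, rounded to one decimal place."""
    return {name: round(sum(vals) / len(vals), 1)
            for name, vals in columns.items()}


means = column_means(undivided)
for name, value in means.items():
    print(f"{name}: {value}")
```

The recomputed means agree with the printed row for the untreated, Pio and imatinib + Pio columns. For imatinib the exact mean is 1066/6 ≈ 177.67, which the table appears to print truncated as 177.6.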




doi:10.1038/nature14985 


The spliceosome is a therapeutic vulnerability in MYC-driven cancer


Tiffany Y.-T. Hsu, Lukas M. Simon, Nicholas J. Neill, Richard Marcotte, Azin Sayad, Christopher S. Bland,
Gloria V. Echeverria, Tingting Sun, Sarah J. Kurley, Siddhartha Tyagi, Kristen L. Karlin, Rocio Dominguez-Vidaña,
Jessica D. Hartman†, Alexander Renwick, Kathleen Scorsone, Ronald J. Bernardi, Samuel O. Skinner, Antrix Jain,
Mayra Orellana, Chandraiah Lagisetti, Ido Golding, Sung Y. Jung, Joel R. Neilson, Xiang H.-F. Zhang,
Thomas A. Cooper, Thomas R. Webb, Benjamin G. Neel, Chad A. Shaw & Thomas F. Westbrook


MYC (also known as c-MYC) overexpression or hyperactivation is 
one of the most common drivers of human cancer. Despite intensive 
study, the MYC oncogene remains recalcitrant to therapeutic 
inhibition. MYC is a transcription factor, and many of its pro- 
tumorigenic functions have been attributed to its ability to regulate 
gene expression programs. Notably, oncogenic MYC activation
has also been shown to increase total RNA and protein production in 
many tissue and disease contexts. While such increases in RNA
and protein production may endow cancer cells with pro-tumour 
hallmarks, this increase in synthesis may also generate new or heightened
burden on MYC-driven cancer cells to process these macromolecules
properly. Here we discover that the spliceosome is a new
target of oncogenic stress in MYC-driven cancers. We identify 
BUD31 as a MYC-synthetic lethal gene in human mammary epithelial
cells, and demonstrate that BUD31 is a component of the core
spliceosome required for its assembly and catalytic activity. Core 
spliceosomal factors (such as SF3B1 and U2AF1) associated with 
BUD31 are also required to tolerate oncogenic MYC. Notably, 
MYC hyperactivation induces an increase in total precursor messen- 
ger RNA synthesis, suggesting an increased burden on the core 
spliceosome to process pre-mRNA. In contrast to normal cells, par- 
tial inhibition of the spliceosome in MYC-hyperactivated cells leads 
to global intron retention, widespread defects in pre-mRNA mat- 
uration, and deregulation of many essential cell processes. Notably, 
genetic or pharmacological inhibition of the spliceosome in vivo 
impairs survival, tumorigenicity and metastatic proclivity of 
MYC-dependent breast cancers. Collectively, these data suggest that 
oncogenic MYC confers a collateral stress on splicing, and that 
components of the spliceosome may be therapeutic entry points 
for aggressive MYC-driven cancers. 

To discover genes and cellular processes required to tolerate onco- 
genic MYC expression, we previously performed a genome-wide 
MYC-synthetic lethal screen in human mammary epithelial cells 
(HMECs) engineered with an inducible MYC and oestrogen receptor 
fusion protein (MYC-ER) for candidates affecting cell viability in a 
MYC-selective manner. This screen nominated BUD31 as a candidate
MYC-synthetic lethal gene (Fig. 1a), in which barcoded BUD31 short 
hairpin RNAs (shRNAs) consistently dropped out of the population in 
MYC-hyperactivated cells relative to cells without MYC induction
(Fig. 1b). In validation experiments, BUD31 depletion restrained clonogenic
growth and activated apoptosis in MYC-induced cells, as compared
to MYC-normal cells (Extended Data Fig. 1a–c). Expression of
shRNA-resistant BUD31 rescued the MYC-synthetic lethal phenotype
of BUD31 shRNA (Fig. 1c and Extended Data Fig. 1d), indicating that
the phenotype is an RNA interference (RNAi) on-target effect. 

BUD31 has been linked to the spliceosome in yeast, but its function
in mammalian systems has not been determined. To uncover the
molecular function(s) of BUD31, we identified BUD31-interacting 
proteins by Flag-tagged BUD31 immunoprecipitation from cells with 
or without RNase A (which eliminates protein-protein interactions 
mediated by RNA tethering), followed by mass spectrometry. 
Remarkably, 79 out of 134 core spliceosomal components were assoc- 
iated with BUD31 (Extended Data Fig. 2a), suggesting a strong asso- 
ciation between BUD31 and the spliceosome in human cells. 

The spliceosome is a dynamic molecular machine consisting of sev- 
eral nuclear protein complexes that cycle on and off of pre-mRNA 
during intronic splicing. Co-immunoprecipitation experiments
confirmed that BUD31 associates with several subcomplexes of the 
spliceosome, including the Prp19-CDC5L subcomplex (PRPF19), the 
U2 small nuclear ribonucleoprotein particles (snRNPs; SF3B1 and 
SF3A1), U2-related factors (U2AF1), the U5 snRNP (EFTUD2), and 
Sm proteins (SNRPF) (Fig. 1d and Extended Data Fig. 2c), but inter- 
action with non-spliceosomal proteins was not detected (Extended Data 
Fig. 2d, e). To test more broadly the association of BUD31 with sub- 
complexes of the spliceosome, we performed bimolecular fluorescence 
complementation (BiFC) between BUD31 and proteins from each 
major spliceosomal subcomplex. BiFC analysis indicated that BUD31 
associates with components of the major snRNPs (U1, U2, U4/U6 and 
U5) as well as Sm proteins (Fig. 1e and Extended Data Fig. 2b), indicating
that BUD31 is present at several stages of spliceosomal assembly.

To examine more directly whether BUD31 has a role in pre-mRNA
splicing, we tested in vitro splicing efficiency using nuclear extracts 
with or without BUD31 knockdown. BUD31 loss significantly inhib- 
ited pre-mRNA splicing (Extended Data Fig. 2f-i). In addition, knock- 
down of BUD31 led to defects in early spliceosome assembly, as 
indicated by impaired formation of complex A (Extended Data Fig. 
2h, i). Collectively, these data indicate that HMECs require a core 
spliceosomal protein (BUD31) to tolerate dysregulated MYC. 

We proposed that cells with oncogenic MYC required BUD31 for 
cell survival because of its role in the spliceosome. To test this hypo- 
thesis, we generated a BUD31 mutant deficient in binding core spli- 
ceosomal proteins by mutating a highly conserved region spanning a 
C,-C, zinc-finger. Mutation of this region abrogated BUD31 inter- 
action with spliceosomal proteins (Extended Data Fig. 2j). To deter- 
mine whether this region is also necessary for cells to tolerate MYC 
hyperactivation, we performed an in vitro competition assay. Green 


1. Verna & Marrs McLean Department of Biochemistry and Molecular Biology, Baylor College of Medicine, Houston, Texas 77030, USA.
2. Interdepartmental Program in Molecular and Biomedical Sciences, Baylor College of Medicine, Houston, Texas 77030, USA.
3. Medical Scientist Training Program, Baylor College of Medicine, Houston, Texas 77030, USA.
4. Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas 77030, USA.
5. Princess Margaret Cancer Centre, University Health Network, Toronto M5G 2C4, Canada.
6. Department of Molecular Physiology and Biophysics, Baylor College of Medicine, Houston, Texas 77030, USA.
7. Department of Pathology and Immunology, Baylor College of Medicine, Houston, Texas 77030, USA.
8. Department of Molecular and Cellular Biology, Baylor College of Medicine, Houston, Texas 77030, USA.
9. Department of Pediatrics, Baylor College of Medicine, Houston, Texas 77030, USA.
10. Department of Physics, University of Illinois, Urbana, Illinois 61801, USA.
11. Center for Chemical Biology, Bioscience Division, SRI International, Menlo Park, California 94025, USA.
12. The Lester and Sue Smith Breast Center, Baylor College of Medicine, Houston, Texas 77030, USA.
13. Department of Medical Biophysics, University of Toronto, Toronto M5S 2J7, Canada.
†Present address: Humacyte, Morrisville, North Carolina 27560, USA.


384 | NATURE | VOL 525 | 17 SEPTEMBER 2015 




[Figure 1 panels a–k: a, schematic of MYC-normal and MYC-hyperactivated (+4-OHT) cells with or without shBUD31 (synthetic lethal when MYC-hyperactivated and BUD31-depleted); b, BUD31 shRNA barcode abundance, control vs. MYC; c, relative cell number with GFP or BUD31 cDNA; d, immunoblots (IB: SF3B1, PRPF19, U2AF1, SF3A1, EFTUD2, SNRPF, Flag) of 1% input and Flag immunoprecipitates from control and Flag–BUD31 cells, with or without RNase; e, BiFC signal by subcomplex (Sm proteins, U1 snRNP, U2 snRNP, Prp19-CDC5L, U4/U6 snRNP, Lsm proteins, U5 snRNP, second step); f, competition assay with shBUD31; g, clonogenicity with vehicle, 10 nM SD6 or 20 nM SD6, control vs. MYC; h–k, relative cell number with or without shSF3B1, shU2AF1, shEFTUD2 and shSNRPF, control vs. MYC.]


Figure 1 | The spliceosome is required for cells to tolerate oncogenic MYC
hyperactivation. a, BUD31 is a MYC-synthetic lethal gene. b, BUD31 shRNA
(shBUD31) barcode abundances with/without MYC-ER hyperactivation
(mean ± s.e.m., n = 3 biological replicates). c, Relative number of MYC-ER
HMECs with dox-inducible shRNA targeting the 3′ untranslated region (UTR)
of BUD31, and constitutive shRNA-resistant Flag–GFP or Flag–BUD31
expression (mean ± s.e.m., n = 4 technical replicates). d, Flag–BUD31
co-immunoprecipitation for core spliceosomal factors. e, Interaction between
BUD31 and spliceosomal proteins assessed by BiFC (mean ± s.e.m., n = 3


fluorescent protein (GFP)-expressing MYC-driven breast cancer 
cells encoding inducible BUD31 shRNA were transduced with 
shRNA-resistant wild-type or mutant BUD31 complementary DNA, 
and these cells were mixed with non-transduced, GFP-negative cells. 
BUD31 knockdown significantly inhibited the proliferation of MYC- 
driven cancer cells. Proliferation was fully rescued by wild-type BUD31 
cDNA but not by a BUD31 mutant deficient in spliceosomal binding 
(Fig. 1f), suggesting that BUD31 association with the spliceosome 
is required to support the survival of MYC-hyperactivated cells. 
More broadly, these results indicate that oncogenic MYC may increase 
cellular dependency on spliceosome function. By contrast, ectopic 
expression of the oncogenes HER2 (also known as ERBB2) and 
EGFR did not enhance the effects of BUD31 depletion (Extended 
Data Fig. 3a, b), suggesting that the stress imposed by MYC on spli- 
ceosomal function is not a universal feature of the oncogenic state. 

To test whether one or more subcomplexes of the spliceosome are 
required to tolerate aberrant MYC activity, we examined additional 
components of spliceosome assembly and catalysis including SF3B1 
(U2 snRNP), U2AF1 (U2-related splicing factor), EFTUD2 (U5 
snRNP) and SNRPF (core Sm protein found in every snRNP complex). 
Notably, partial depletion of each spliceosomal component led to loss of 
cell viability (Fig. 1h–k and Extended Data Fig. 4a–d) and increased
apoptosis (Extended Data Fig. 4e-h) in MYC-hyperactivated cells. This 
suggests that several subcomplexes of the core spliceosome are required 
for cells to tolerate oncogenic MYC, and that MYC-hyperactivated cells 
are sensitive to modest perturbations in spliceosome function. 

Next, we investigated whether pharmacological inhibition of the 
spliceosome is also synthetic lethal with MYC. Several pharmacological 
agents (for example, FR901464, pladienolides and their derivatives) 
have been characterized to bind the core SF3b spliceosomal complex 
components and inhibit spliceosome function. However, most of these


technical replicates). f, GFP+ MYC-dependent cells with inducible
shBUD31-UTR and constitutive wild-type, mutant BUD31, or negative control
cDNA expression were mixed with GFP− cells and passaged (mean ± s.e.m.,
n = 8 technical replicates, two-tailed Student’s t-test). g, Change in MYC-ER
HMEC clonogenicity after SD6 treatment (mean ± s.e.m., n = 4 technical
replicates, two-tailed Student’s t-test). h–k, Relative number of MYC-ER
HMECs after partial depletion of core spliceosomal proteins (mean ± s.e.m.,
n = 4 technical replicates, one-way analysis of variance (ANOVA)).
**P < 0.01, ***P < 0.001.


inhibitors are not amenable for in vivo delivery. We developed a new 
small molecule inhibitor of SF3B1, known as SD6, that impairs spliceosome
function and is bioavailable in mammals. Consistent with our
genetic data, low SD6 concentrations significantly suppressed colony 
formation (Fig. 1g) and induced apoptosis (Extended Data Fig. 4i) in a 
MYC-selective manner. The synthetic-lethal interaction between MYC 
hyperactivation and core spliceosome perturbation suggests that pre- 
mRNA splicing is necessary to tolerate oncogenic MYC. 

In many different cell lineages and experimental systems, oncogenic 
MYC activation has been shown to amplify the synthesis of cellular 
mRNA through direct or indirect mechanisms. In agreement,
MYC hyperactivation in HMECs increased total cellular mRNA syn- 
thesis and mRNA steady-state levels (Fig. 2a) without an increase in 
cellular growth rate (Extended Data Fig. 3c). In contrast to a recent report 
in B-cell compartments, MYC hyperactivation did not affect the levels
of spliceosome proteins in HMECs (data not shown), suggesting that 
increased pre-mRNA dosage is not compensated for by higher spliceo- 
some levels. Thus, we proposed that the MYC-induced increase in global 
mRNA synthesis confers increased pressure on the spliceosome to pro- 
cess pre-mRNAs, and partial perturbation of the spliceosome would lead 
to widespread defects in the splicing of pre-mRNA introns in the MYC- 
hyperactive state. To test this hypothesis, we compared intron retention 
(IR) after BUD31 knockdown in MYC-normal or MYC-hyperactivated 
cells. We performed RNA-sequencing (RNA-seq) from cells in each state 
(normal, BUD31 knockdown, MYC-hyperactive, and MYC-hyperactive 
with BUD31 knockdown) and determined the pre-mRNA splicing effi- 
ciency by calculating IR at junctions across the genome (Fig. 2b). Because 
the analysis of intronic reads may be influenced by the presence of stable 
RNAs within introns and/or spliced lariats, we restricted the analysis to 
reads directly spanning exon-intron or exon-exon junction sequences 
(75,623 junctions in 6,861 genes) (see Methods). 
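
The exact definition of the IR coefficient is given in the Methods, which are not reproduced here. As a hedged illustration only, the sketch below assumes that junction IR is the fraction of junction-spanning reads that are unspliced (exon-intron reads divided by exon-intron plus exon-exon reads). The function names and the read counts are hypothetical:

```python
def intron_retention(exon_intron_reads, exon_exon_reads):
    """Fraction of junction-spanning reads that retain the intron.

    Assumed definition (not confirmed by the source):
    IR = unspliced / (unspliced + spliced).
    Returns None when no reads cover the junction.
    """
    total = exon_intron_reads + exon_exon_reads
    if total == 0:
        return None
    return exon_intron_reads / total


def ir_change(control_counts, knockdown_counts):
    """Difference in IR (knockdown minus control) for one junction."""
    ir_ctrl = intron_retention(*control_counts)
    ir_kd = intron_retention(*knockdown_counts)
    if ir_ctrl is None or ir_kd is None:
        return None
    return ir_kd - ir_ctrl


# Hypothetical (exon-intron, exon-exon) read counts for one junction.
delta = ir_change(control_counts=(10, 90), knockdown_counts=(30, 70))
print(f"IR change after knockdown: {delta:+.2f}")
```

Restricting the counts to junction-spanning reads, as the text describes, keeps stable intronic RNAs and spliced lariats from inflating the unspliced count.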




[Figure 2 panels a–j: a, poly(A)+ RNA per cell, control vs. MYC; b, schematic of exon-intron reads (unspliced) and exon-exon reads (spliced) in the MYC-normal, BUD31-knockdown, MYC-hyperactivated, and MYC-hyperactivated BUD31-knockdown states; c, junction IR after BUD31 depletion; d, gene IR after BUD31 depletion; e–j, junction IR for representative genes with shBUD31 or MYC + shBUD31.]

Figure 2 | In MYC-hyperactivated cells, perturbation of the spliceosome leads to global
intron retention. a, Left, total poly(A)+ RNA per cell (10° *ng). Right, newly synthesized
4-sU-labelled poly(A) RNA per cell (107° ng). Data are mean ± s.e.m., n = 4 technical
replicates for both assays, two-tailed Student’s t-test. b, Schematic of IR analysis.
c, d, Empirical cumulative distribution of IR coefficients for 75,623 exon-intron junctions
(c) or 6,861 genes (d). Curves represent IR differences after BUD31 depletion in MYC-normal
and MYC-hyperactive states. A rightward shift in the MYC-hyperactive curve indicates
increased IR (Kolmogorov-Smirnov test). e–g, log2-fold changes in junction IR relative to
untreated by RNA-seq of representative genes (mean ± s.e.m., n = 3 biological replicates,
two-tailed Student’s t-test). h–j, RT-PCR validation showing fold change in junction IR
relative to untreated (mean ± s.d., n = 3 biological replicates, two-tailed Student’s t-test).
*P < 0.05, **P < 0.01, ***P < 0.001.


To examine the effects of spliceosome perturbation in the normal 
and oncogenic MYC states, we compared the effect of BUD31 knock- 
down on junction IR coefficients in wild-type and MYC-hyperacti- 
vated cells. Notably, BUD31 depletion caused significantly more IR in 
the MYC-hyperactive state than in the MYC-normal state (Fig. 2c, 
P<10 *”). Similar results were observed when junction coefficients 
were computed on a gene level (Fig. 2d, P< 107 '®”). The increase in 
IR conferred by aberrant MYC activation and BUD31 shRNA was 
validated on individual exon-intron junctions via quantitative reverse 
transcriptase PCR (qRT-PCR) (examples in Fig. 2e-j). IR was not 
limited to a few discrete genes. Instead, BUD31 knockdown in the 
MYC-hyperactive state led to significantly increased IR in 42% of
genes analysed (2,848 of 6,861, P< 0.05). These data indicate that 
the combination of oncogenic MYC activation and partial spliceo- 
some inhibition leads to a widespread increase in IR. This is consist- 
ent with the hypothesis that the MYC-induced increase in pre-mRNA 
synthesis enhances cellular dependency on optimal spliceosome func- 
tion by raising the level of pre-mRNA substrates for spliceosomal 
processing. 
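
The comparison of IR distributions between the MYC-normal and MYC-hyperactive states relies on a Kolmogorov-Smirnov test (Fig. 2c, d). A minimal sketch of the two-sample KS statistic, the largest gap between two empirical cumulative distributions, is given below. It uses only the standard library and toy values rather than the study's data, and it does not compute a P value:

```python
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic.

    Returns the maximum absolute difference between the two empirical
    cumulative distribution functions, evaluated at every observed value.
    """
    a = sorted(sample_a)
    b = sorted(sample_b)
    d = 0.0
    for x in a + b:
        cdf_a = bisect_right(a, x) / len(a)  # share of a that is <= x
        cdf_b = bisect_right(b, x) / len(b)  # share of b that is <= x
        d = max(d, abs(cdf_a - cdf_b))
    return d


# Toy IR changes (hypothetical): the second sample is shifted to the right.
myc_normal = [1, 2, 3, 4]
myc_hyperactive = [3, 4, 5, 6]
print(ks_statistic(myc_normal, myc_hyperactive))

# Share of genes with significantly increased IR, from the counts in the text.
print(round(100 * 2848 / 6861))
```

A rightward shift of one curve, as described for the MYC-hyperactive state, appears as a large statistic; identical samples give zero. The last line reproduces the 42% quoted above from 2,848 of 6,861 genes.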




Intron-retaining pre-mRNAs often fail to complete mRNA maturation
and are commonly degraded via quality control mechanisms.
Because the combination of MYC hyperactivation and spliceosome 
inhibition led to a global increase in intron retention (Fig. 2c, d), we 
proposed that these cells may contain widespread defects in pre-mRNA
maturation and stability (Fig. 3a). To test this hypothesis, we measured
the levels of cellular poly(A)+ RNA in each of the four states (with/
without MYC hyperactivation, with/without BUD31 shRNA) before 
and after treatment with the transcriptional inhibitor actinomycin D. 
After actinomycin D treatment, cellular poly(A)+ RNA decreased by
comparable levels (~16–19%) in control cells with or without BUD31
knockdown (Fig. 3b). Notably, MYC-hyperactivated cells exhibited 
enhanced mRNA stability, perhaps resulting from increased polysomal 
loading of mRNA during MYC-induced translation. By contrast, cells
containing MYC hyperactivation and BUD31 depletion exhibited a
substantially greater loss (38%) of poly(A)+ RNA after actinomycin
D treatment, suggesting a defect in pre-mRNA maturation and/or 
stability in the combined MYC-hyperactivated and BUD31-shRNA 
state. Similarly, fluorescence in situ hybridization (FISH) measurements 


[Figure 3 panels a–f: a, model in which partial inhibition of splicing yields intron-retained pre-mRNA that is targeted for degradation; b, change in poly(A)+ RNA after actinomycin D, control vs. MYC, with or without shBUD31; c, steady-state poly(A)+ RNA per cell, control vs. MYC, with or without shBUD31; d, GO enrichment of intron-retained genes; e, f, IR and RNA levels of representative genes with or without shBUD31.]


Figure 3 | Combined spliceosomal perturbation and MYC hyperactivation
inhibits pre-mRNA maturation. a, Model of MYC–spliceosome synthetic
lethality. b, Difference in cellular poly(A)+ RNA in HMECs after actinomycin
D (AD) treatment (n = 3 biological replicates, two-tailed Student’s t-test).
c, Steady-state poly(A)+ RNA levels per cell (10⁻⁴ ng) (n = 4 biological
replicates, two-tailed Student’s t-test). d, Gene Ontology (GO) enrichment of
intron-retained genes in the MYC-hyperactive and BUD31-depleted state.
Dashed line indicates P = 0.05. e, f, In MYC-hyperactive BUD31 shRNA cells,
representative genes display increased IR (e) and decreased steady-state
RNA levels (f) after BUD31 knockdown in MYC-hyperactivated cells. Bar
colours represent GO terms, see legend in Extended Data Fig. 6. Data are
mean ± s.e.m. **P < 0.01, ***P < 0.001. NS, not significant.




of poly(A)+ RNA revealed that the combination of MYC hyperactivation
and BUD31 knockdown led to a substantially greater decrease
(60%) in poly(A)+ RNA after actinomycin D treatment (Extended
Data Fig. 5a). Similar trends were observed in nuclear RNA pools, 
consistent with defects in nuclear pre-mRNA maturation (Extended 
Data Fig. 5b). Consistent with this decrease in pre-mRNA maturation 
and stability, cells containing oncogenic MYC and BUD31 knockdown 
exhibited significantly lower (54%) steady-state levels of poly(A)+ RNA
(Fig. 3c). Collectively, these results indicate that MYC hyperactivation 
increases cellular pre-mRNA synthesis, and inhibition of the spliceo- 
some reduces the cellular capacity to process this pre-mRNA burden. 
The result of this MYC-hyperactivated and spliceosome-hypomorphic 
state is enhanced intron retention, decreased mRNA maturation and 
stability, and a significant loss of steady-state cellular mRNA. 

Gene Ontology analysis of genes with the most significant intron 
retention in the combined MYC-hyperactive and BUD31-depleted 
state (2,848 out of 6,861 genes analysed for IR) suggests that many
essential processes and subcellular structures were affected, including 
gene expression, DNA replication and repair, the mitotic spindle, 
unfolded protein response, and RNA splicing (Fig. 3d). Many genes 
participating in these essential cell processes exhibited increased IR in 
the combined MYC-hyperactive and BUD31-knockdown state (rep- 
resentative genes in Fig. 3e) and a concomitant decrease in RNA levels, 
consistent with a defect in maturation and stability of IR-containing 
transcripts (Fig. 3f). Consistent with their role in crucial cellular processes,
knockdown of these genes reduced cell number by 0.7–4.2-fold
(as quantified by barcode-tag abundance, Extended Data Fig. 6). 
Together, these data are consistent with the hypothesis that the com- 
bination of oncogenic MYC and spliceosome inhibition leads to wide- 
spread loss of mRNA integrity, resulting in the deregulation of many 
essential genes and processes instead of a single pathway. 

Because oncogenic MYC significantly increases the sensitivity of 
HMECs to inhibition of the spliceosome, we proposed that MYC-driven 
cancers may be hyperdependent on core spliceosomal function to sup- 
port their survival. We queried whether MYC-driven breast cancer 
cell lines exhibit increased sensitivity to knockdown of core spliceoso- 
mal genes. Recently, we conducted genome-wide RNAi screens in a 
panel of 72 breast cancer and immortalized cell lines for genes affecting 
cell viability (Fig. 4a) (R.M., A.S. and B.G.N., manuscript in preparation).
From this data set, we tested for a correlation between
MYC-dependency (as indicated by sensitivity to MYC shRNAs) and 
dependency on the spliceosome (as indicated by sensitivity to shRNAs 
targeting spliceosome components in the shRNA library), or on 100,000 
randomly drawn gene sets. Notably, MYC-dependent breast cancer cell 
lines were significantly more sensitive to shRNAs targeting the core 
spliceosome (Fig. 4b, P= 0.005). The correlation between MYC- 
dependency and spliceosome-dependency was significantly pronounced 
in the basal breast cancer lines (Fig. 4c, P< 0.00001), an aggressive 
molecular subtype of breast cancer frequently driven by MYC. 
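
The comparison against 100,000 randomly drawn gene sets is described only briefly, so the following is a sketch under stated assumptions rather than the authors' procedure. It assumes that each cell line has one MYC-sensitivity score and one sensitivity score per gene, that a gene set is summarized by its mean score per cell line, and that significance is the share of random gene sets of equal size whose correlation with the MYC score is at least as high as the observed one. All names, data and the number of draws are hypothetical:

```python
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def set_score(scores, gene_set):
    """Mean sensitivity score across a gene set, one value per cell line."""
    n_lines = len(next(iter(scores.values())))
    return [sum(scores[g][i] for g in gene_set) / len(gene_set)
            for i in range(n_lines)]


def empirical_p(myc, scores, gene_set, n_draws=1000, seed=0):
    """Observed correlation and its empirical P value against random sets."""
    rng = random.Random(seed)
    observed = pearson(myc, set_score(scores, gene_set))
    genes = sorted(scores)
    hits = 0
    for _ in range(n_draws):
        draw = rng.sample(genes, len(gene_set))
        if pearson(myc, set_score(scores, draw)) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return observed, (hits + 1) / (n_draws + 1)


# Toy data: 12 cell lines, 40 genes, 3 of which track the MYC score.
toy = random.Random(1)
myc = [toy.gauss(0, 1) for _ in range(12)]
scores = {f"gene{i}": [toy.gauss(0, 1) for _ in range(12)] for i in range(40)}
for g in ("gene0", "gene1", "gene2"):
    scores[g] = [m + 0.1 * toy.gauss(0, 1) for m in myc]

observed, p = empirical_p(myc, scores, ["gene0", "gene1", "gene2"], n_draws=500)
print(f"correlation = {observed:.2f}, empirical P = {p:.4f}")
```

Drawing random sets of the same size as the set of interest controls for the fact that averaging over more genes changes the spread of the summary score.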

Triple-negative breast cancers are commonly driven by MYC, and 
exhibit an aggressive, highly metastatic clinical course. To determine 
whether MYC-driven triple-negative breast cancers are dependent on 
spliceosomal integrity for their tumorigenic and metastatic proclivity, 
we tested the effects of genetic and pharmacological inhibition of the 
spliceosome on MYC-dependent and metastatic triple-negative breast 
cancer (TNBC) models. Inducible BUD31 shRNA reduced cell viability 
and increased apoptosis in MYC-dependent TNBC cells in vitro 
(Fig. 4d, e and Extended Data Fig. 7a, b). Similar to MYC-ER HMECs, 
MYC protein levels remained unchanged during BUD31 depletion in 
these MYC-dependent cancer cell lines (Extended Data Fig. 8a, b), sug- 
gesting that the apoptotic response was not due to loss of the driver 
oncogene (MYC). To assess the effect of spliceosomal perturbation on 
tumour growth, we established a pooled competition assay that uses 
shRNA-associated barcodes to detect changes in tumour cell fitness 
(Extended Data Fig. 9). In the metastatic TNBC cell line MDA-MB-231-LM2
(LM2; ref. 19), inducible MYC-shRNA-expressing cells dropped
out of the tumour population, confirming the MYC dependency of this 
TNBC model (Fig. 4f). Similarly, tumour cells containing BUD31 or 
SF3B1 shRNA dropped out of the tumour population (Fig. 4f). 
Tumorigenicity of another MYC-dependent TNBC model (SUM159) 
was similarly impaired by BUD31 depletion (Extended Data Fig. 7c, d). 
These data suggest that the loss of BUD31 or other core spliceosomal 
factors inhibits MYC-dependent breast cancer growth in vivo. 

Because MYC-driven breast cancers are prone to metastasize to 
visceral organs including the lungs”, we tested whether perturbation 
of spliceosome function affected metastatic expansion of MYC- 
dependent LM2 cells. As shown in Fig. 4g, metastatic cells with 
BUD31 knockdown were significantly depleted from the population 


[Figure 4 graphic not reproduced. Panels: a, in vitro culture and hybridization
schematic for the genome-wide shRNA library across 72 cell lines; b, c,
distributions of MYC-siMEM score (MYC co-dependency) for all breast lines
(n = 72) and basal breast lines (n = 32), with core spliceosome factors marked;
d, e, LM2 cells with or without shBUD31, with BUD31, cleaved caspase-3 and
vinculin immunoblots; f, primary tumour (shControl, shMYC, shBUD31, shSF3B1);
g, lung metastasis (shControl, shBUD31); h, primary tumour (vehicle, SD6);
i, lung metastasis (vehicle, SD6).]


Figure 4 | In vivo perturbation of spliceosomal activity impairs MYC-dependent
breast tumours and metastases. a, Schematic for identifying genetic
co-dependencies in breast cancer lines. b, c, MYC-siMEM (mixed-effect model)
score, which represents the correlation between cell line sensitivity to MYC
shRNAs and sensitivity to shRNAs targeting random gene sets (n = 100,000; see
Methods), is plotted against frequency of gene sets. Increasing MYC-siMEM
values denote higher correlation with MYC-dependency. Red arrows indicate
MYC-siMEM scores for spliceosome-dependency in all breast cancer lines
(n = 72) (b) and the basal breast cancer subset (n = 32) (c). P value by
bootstrap analysis for both. d, e, MDA-MB-231-LM2 cells with shBUD31 display
diminished BUD31 protein levels (d, bottom), decreased cell numbers (d, top)
(mean ± s.e.m., n = 8 technical replicates, two-tailed Student’s t-test), and
increased caspase-3 cleavage (e, bottom) and caspase-3/7 luminescence (e, top)
(mean ± s.e.m., n = 3 technical replicates, two-tailed Student’s t-test).
f, g, Barcode-shRNA abundance of LM2 cells within primary tumours (f) or
pulmonary metastases (g). Mean barcode abundance in each tumour or lung is
normalized to the injected cell population (n = 3 technical replicates,
two-tailed Student’s t-test). h, Change in LM2 tumour growth after 2 weeks of
vehicle (n = 13) or SD6 (n = 10) infusion. Bars indicate mean values
(two-tailed Student’s t-test). i, Pulmonary LM2 bioluminescence after 10-day
infusion with vehicle (n = 7) or SD6 (n = 6). Bars indicate median values
(Mann-Whitney test). *P < 0.05, **P < 0.01, ***P < 0.001.




(>133.5-fold change), with most doxycycline (dox)-positive tumours 
containing BUD31 shRNA barcodes below the level of detection. These 
data suggest that BUD31 and the spliceosome are essential for MYC- 
dependent breast tumorigenicity and metastatic expansion in vivo. 

Next, we tested whether pharmacological inhibition of the spliceosome
also impaired tumorigenic and metastatic potential of MYC-dependent
TNBC cells. Compared to MYC-normal cell lines (half-maximal
inhibitory concentration (IC50) value ~ 53 nM), MYC-driven
cancer cells were significantly more sensitive (IC50 value ~ 4 nM) to the
spliceosome inhibitor SD6 in vitro (Extended Data Fig. 10a). Similarly, 
SD6 suppressed the proliferation of a MYC-driven B-cell model* 
(Extended Data Fig. 10b), suggesting that oncogenic MYC may confer 
hyperdependency on the spliceosome in many epigenetic backgrounds 
and cancer types. In primary LM2 tumour xenografts, SD6 potently 
restrained tumour growth with no toxicities in any organ system exam- 
ined, suggesting that splicing is essential for the tumorigenicity of these 
MYC-dependent breast cancer cells (Fig. 4h). Similarly, SD6 impaired 
lung metastatic expansion in experimental metastasis assays (Fig. 4i), 
and extended progression-free survival (Extended Data Fig. 10c). Col- 
lectively, these data suggest that MYC-driven breast cancers depend on 
spliceosomal integrity for their tumorigenic and metastatic progression. 

Altogether, the results suggest that MYC-driven breast cancers have
an enhanced dependency on the core spliceosome. Recent studies have
shown that MYC regulates splicing of select genes via induction of 
alternative splicing factors or components of the core spliceosome (refs 16, 21).
This study suggests that MYC may induce a much broader stress on 
splicing via its ability to increase global pre-mRNA synthesis. Recently, 
there has been considerable investigation into how MYC elicits a wide- 
spread increase in mRNA synthesis across the transcriptome**”?”. 
Notably, either direct or indirect mechanisms of increased pre-mRNA 
synthesis elicited by MYC could lead to an enhanced dependency on the 
spliceosome, and thus make MYC-driven cancers candidates for spliceo- 
some-based therapies. These observations provoke the important ques- 
tion of whether MYC-induced amplification of mRNA synthesis may 
also generate vulnerabilities in other aspects of RNA processing (such as 
mRNA capping, polyadenylation or mRNA export) and downstream 
protein biosynthesis* in MYC-driven cancers. Notably, the spliceosome 
may be a target of both oncogene addiction and oncogenic stress. 
Components of the U2 snRNP, such as SF3B1 and U2AF1, contain
frequent and recurrent somatic mutations that cluster in an evolutionarily
conserved domain, suggestive of oncogenic function (refs 25, 26). On the basis
of such putative oncogenic functions, the spliceosome has been proposed 
as a target for classical oncogene addiction, in which spliceosome mutant 
tumours may be addicted to the oncogenic functions of spliceosome 
mutants and thus sensitive to spliceosome inhibitors. However, this study 
and others (refs 27, 28) have shown that inhibition of spliceosome components is
deleterious in cancer cell line models that lack spliceosome mutations, 
suggesting that other drivers of cancer (such as MYC) are determinants of 
sensitivity to spliceosome inhibitors. Because oncogenic MYC is known 
to drive several pro-tumorigenic programs that include rewiring of bio- 
synthetic pathways”, our model provokes the important hypothesis 
that cellular processes (such as splicing) that enable cancer cells to tolerate 
such widespread shifts in macromolecular synthesis may provide entry 
points for anti-cancer therapies. 


Online Content Methods, along with any additional Extended Data display items 
and Source Data, are available in the online version of the paper; references unique 
to these sections appear only in the online paper. 


Received 27 August 2014; accepted 24 July 2015. 
Published online 2 September 2015. 


1. Eilers, M. & Eisenman, R. N. Myc’s broad reach. Genes Dev. 22, 2755-2766 (2008).

2. Sabo, A. & Amati, B. Genome recognition by MYC. Cold Spring Harb. Perspect. Med. 4, a014191 (2014).

3. Dang, C. V. MYC, metabolism, cell growth, and tumorigenesis. Cold Spring Harb. Perspect. Med. 3, a014217 (2013).

4. Lin, C. Y. et al. Transcriptional amplification in tumor cells with elevated c-Myc. Cell 151, 56-67 (2012).

5. Nie, Z. et al. c-Myc is a universal amplifier of expressed genes in lymphocytes and embryonic stem cells. Cell 151, 68-79 (2012).

6. Ruggero, D. The role of Myc-induced protein synthesis in cancer. Cancer Res. 69, 8839-8843 (2009).

7. Barna, M. et al. Suppression of Myc oncogenic activity by ribosomal protein haploinsufficiency. Nature 456, 971-975 (2008).

8. Luo, J., Solimini, N. L. & Elledge, S. J. Principles of cancer therapy: oncogene and non-oncogene addiction. Cell 136, 823-837 (2009).

9. Kessler, J. D. et al. A SUMOylation-dependent transcriptional subprogram is required for Myc-driven tumorigenesis. Science 335, 348-353 (2012).

10. Masciadri, B. et al. Characterization of the BUD31 gene of Saccharomyces cerevisiae. Biochem. Biophys. Res. Commun. 320, 1342-1350 (2004).

11. Wahl, M. C., Will, C. L. & Luhrmann, R. The spliceosome: design principles of a dynamic RNP machine. Cell 136, 701-718 (2009).

12. Bonnal, S., Vigevani, L. & Valcarcel, J. The spliceosome as a target of novel antitumour drugs. Nature Rev. Drug Discov. 11, 847-859 (2012).

13. Lagisetti, C. et al. Optimization of antitumor modulators of pre-mRNA splicing. J. Med. Chem. 56, 10033-10044 (2013).

14. Kanazawa, S., Soucek, L., Evan, G., Okamoto, T. & Peterlin, B. M. c-Myc recruits P-TEFb for transcription, cellular proliferation and apoptosis. Oncogene 22, 5707-5711 (2003).

15. Rahl, P. B. et al. c-Myc regulates transcriptional pause release. Cell 141, 432-445 (2010).

16. Koh, C. M. et al. MYC regulates the core pre-mRNA splicing machinery as an essential step in lymphomagenesis. Nature 523, 96-100 (2015).

17. Garneau, N. L., Wilusz, J. & Wilusz, C. J. The highways and byways of mRNA decay. Nature Rev. Mol. Cell Biol. 8, 113-126 (2007).

18. Mezquita, P., Parghi, S. S., Brandvold, K. A. & Ruddell, A. Myc regulates VEGF production in B cells by stimulating initiation of VEGF mRNA translation. Oncogene 24, 889-901 (2005).

19. Minn, A. J. et al. Distinct organ-specific metastatic potential of individual breast cancer cells and primary tumors. J. Clin. Invest. 115, 44-55 (2005).

20. Di Cosimo, S. & Baselga, J. Management of breast cancer with targeted agents: importance of heterogeneity. Nature Rev. Clin. Oncol. 7, 139-147 (2010).

21. David, C. J., Chen, M., Assanah, M., Canoll, P. & Manley, J. L. HnRNP proteins controlled by c-Myc deregulate pyruvate kinase mRNA splicing in cancer. Nature 463, 364-368 (2010).

22. Sabo, A. et al. Selective transcriptional regulation by Myc in cellular growth control and lymphomagenesis. Nature 511, 488-492 (2014).

23. Walz, S. et al. Activation and repression by oncogenic MYC shape tumour-specific gene expression profiles. Nature 511, 483-487 (2014).

24. Lin, C. J. et al. Targeting synthetic lethal interactions between Myc and the eIF4F complex impedes tumorigenesis. Cell Rep. 1, 325-333 (2012).

25. Graubert, T. A. et al. Recurrent mutations in the U2AF1 splicing factor in myelodysplastic syndromes. Nature Genet. 44, 53-57 (2011).

26. Papaemmanuil, E. et al. Somatic SF3B1 mutation in myelodysplasia with ring sideroblasts. N. Engl. J. Med. 365, 1384-1395 (2011).

27. Hubert, C. G. et al. Genome-wide RNAi screens in human brain tumor isolates reveal a novel viability requirement for PHF5A. Genes Dev. 27, 1032-1045 (2013).

28. Adler, A. S. et al. An integrative analysis of colon cancer identifies an essential function for PRPF6 in tumor growth. Genes Dev. 28, 1068-1084 (2014).

29. Cunningham, J. T., Moreno, M. V., Lodi, A., Ronen, S. M. & Ruggero, D. Protein and nucleotide biosynthesis are coupled by a single rate-limiting enzyme, PRPS2, to drive cancer. Cell 157, 1088-1103 (2014).

30. Liu, Y. C. et al. Global regulation of nucleotide biosynthetic genes by c-Myc. PLoS ONE 3, e2722 (2008).


Acknowledgements We would like to thank J. Rosen, S. Butler, K. Neugebauer,
M. Moore, S. Elledge, T. Davoli and members of the T.F.W., C.A.S. and T.A.C.
laboratories for comments, and P. Yu for bioinformatics support. The authors
also acknowledge the joint participation by Adrienne Helis Melvin Medical
Research Foundation through its direct engagement in the continuous active
conduct of medical research in conjunction with Baylor College of Medicine for
cancer research. The Dan L. Duncan Cancer Center Shared Resources was supported
by the NCI P30CA125123 Center Grant and provided technical assistance including
Cell-Based Assay Screening Service (D. Liu), Genomic and RNA Profiling Resource
(L. White), Biostatistics & Informatics Shared Resource (S. Hilsenbeck),
Cytometry and Cell Sorting (J. Sederstrom; P30 AI036211 and S10 RR024574), and
the Proteomics and Metabolomics Core Facility (Cancer Prevention and Research
Institute of Texas, RP12009). T.Y.-T.H. was supported by NIH pre-doctoral
fellowship (NCI 1F30CA180447) and CPRIT training grant (RP101499). M.O. and
R.J.B. were supported by The Gillson Longenbaugh Foundation. R.J.B. was
supported by Alex’s Lemonade Stand Foundation. T.F.W. was supported by CPRIT
(RP120583), the Susan G. Komen for the Cure (KG090355), the NIH
(1R01CA178039-01 and U54-CA149196) and the DOD Breast Cancer Research Program
(BC120604).


Author Contributions T.Y.-T.H., N.J.N., R.M., C.S.B., G.V.E., T.S., S.J.K., S.T.,
K.L.K., J.D.H., K.S., R.J.B., S.O.S., A.J., C.L. and M.O. performed the
experiments. L.M.S., A.S., R.D.-V., A.R. and C.A.S. performed statistical
analyses. I.G., S.Y.J., J.R.N., X.H.-F.Z., T.A.C., T.R.W., B.G.N., C.A.S. and
T.F.W. devised or supervised experiments. T.Y.-T.H. and T.F.W. wrote
the manuscript.


Author Information RNA-seq data sets have been deposited in the NCBI Gene 
Expression Omnibus (GEO) under accession number GSE66182. Reprints and 
permissions information is available at www.nature.com/reprints. The authors declare 
no competing financial interests. Readers are welcome to comment on the online 
version of the paper. Correspondence and requests for materials should be addressed 
to T.F.W. (thomasw@bcm.edu). 




METHODS 

Vectors and virus production. Commercially available pGIPZ shRNAs targeting 
BUD31 (V2LHS_47771 and V2LHS_47770), EFTUD2 (V2LHS_28167), SF3B1 
(V3LHS_ 397872), SNRPF (V2LHS_276933) and U2AF1 (V2LHS_84677) were 
obtained from Open Biosystems. shRNAs targeting the 3’ UTR region of 
BUD31 were designed using the BiopredSI and RNAi Codex algorithms 
(shRNA sequence 5’-TGCTGTTGACAGTGAGCGCCGCTGTCTATCAGCTG 
TGATTTAGTGAAGCCACAGATGTAAATCACAGCTGATAGACAGCGATG 
CCTACTGCCTCGGA-3’). For inducible RNAi experiments, shRNAs were sub- 
cloned into the pINDUCER dox-inducible lentiviral expression system*. 
Lentiviruses and retroviruses were produced by transiently transfecting shRNA 
or cDNA constructs using Mirus Bio TransIT transfection protocols into 293T 
cells and collecting viral supernatants 48 h after transfection. 

Cell culture. HMECs expressing hTERT and inducible MYC-ER (MYC-ER
HMECs), F7 epithelial cells and human mammary epithelial HME1 cells were
cultured in mammary epithelial growth medium (MEGM, Lonza). 293T cells,
HeLa cells and MDA-MB-231-LM2 human breast cancer cells were cultured in
DMEM (Gibco) supplemented with 10% FBS. SUM159 human breast cancer cells
were cultured in F12 (Gibco) media supplemented with 5% FBS, 10 mM HEPES
(Gibco), 5 µg ml⁻¹ insulin (Invitrogen), and 1 µg ml⁻¹ hydrocortisone. The
P493-6 human B-cell lymphoma cell line was cultured in RPMI-1640 supplemented
with 10% FBS (Clontech) and 1% GlutaMAX (Invitrogen). All cell lines were
incubated at 37 °C and 5% CO2. Cell lines were obtained from ATCC, and all cell
lines are tested yearly for mycoplasma contamination. Stable cell lines expressing
shRNAs or cDNAs were generated by lentiviral or retroviral transduction in the
presence of 8 µg ml⁻¹ polybrene followed by selection with appropriate antibiotic
resistance markers.

Cell proliferation assays. MYC-ER HMECs were infected with pINDUCER-shRNA
viruses at a multiplicity of infection (MOI) of 1.3-1.5, and transduced
cells were seeded at a density of 3,000 onto 96-well black plates (Corning).
MYC-ER HMECs with pINDUCER-shBUD31-3'UTR were treated with 300 nM
4-hydroxytamoxifen (4-OHT) to induce MYC hyperactivation, and with
32 ng ml⁻¹ dox (Sigma) to induce shBUD31 expression. SUM159 and
MDA-MB-231-LM2 (LM2) cells were infected with pINDUCER-shBUD31 virus
(targeting the 3' UTR and coding region, respectively) at an MOI of 1.5, and
seeded at a density of 1,000 and 2,000, respectively. Expression of shBUD31 in
LM2 and SUM159 cells was induced with 1 µg ml⁻¹ dox. HMECs and breast cancer
cells were re-fed every 3-4 days until cells reached confluence. At confluence,
cells were fixed in 4% paraformaldehyde, and nuclei were stained with
Hoechst 3321 (1:1,000, Life Technologies). Nuclei were imaged and counted
using the Celigo Imaging Cell Cytometer (Brooks).

For clonogenic assays, breast cancer or immortalized epithelial cells were seeded
at low density (between 500 and 2,000 cells per plate, depending on the cell line)
into 6-cm plates, four replicates per treatment group. MYC-ER HMECs with
pINDUCER-shBUD31-3'UTR were treated with 8 ng ml⁻¹ dox and 300 nM
4-OHT, and MYC-ER HMECs treated with 10 or 20 nM SD6 were also cultured
with 200 nM 4-OHT. Cells were re-fed every 4 days until colonies were
macroscopic. The colonies were stained using Coomassie brilliant blue. Macroscopic
colonies were quantified and normalized to vehicle-treated cells for each cell line.

For the P493-6 cell line with pmyc-tet construct”, MYC was reduced by treating 
cells with 0.1 µg ml⁻¹ tetracycline (Sigma) for 72 h. MYC was induced by washing
P493-6 cells with PBS twice, then culturing cells in RPMI-1640 medium with 10%
Tet System Approved FBS (Clontech) and 1% GlutaMAX. P493-6 cells were
treated with or without 100 nM SD6 and with or without 0.1 µg ml⁻¹ tetracycline
for 4 days.

Immunoprecipitation and mass spectrometry. HeLa cells transduced with len- 
tivirus encoding BUD31 cDNA and non-transduced HeLa cells were collected, 
and nuclear extracts as well as whole-cell lysates were collected as described 
previously. Lysates were treated with RNase A (500 µg ml⁻¹) for 1 h on ice.
For immunoprecipitations, nuclear and whole-cell extracts were ultracentrifuged
at 100,000g, and incubated with 25 µg M2 Flag antibody (Sigma) for 1 h, followed
by ultracentrifugation and incubation with Sepharose-CL4B Protein A beads (GE
Healthcare). Beads were washed with NTN (50 mM Tris-Cl, pH 8.0, 150 mM NaCl
and 0.5% NP-40), and immunocomplexes were resuspended in 1× Laemmli
buffer and resolved on pre-cast 4-20% Novex Tris-Glycine gels (Life Technologies).
Gels were minimally stained with Coomassie brilliant blue, cut into 8
molecular mass ranges, and digested with trypsin. Immunocomplexes were iden- 
tified on a Thermo Fisher LTQ mass spectrometer, and data processing was 
performed as previously described’. 

Enrichment analysis. Human GO annotation file (gene_association.goa_human.gz)
was downloaded from http://geneontology.org/GO.downloads.annotations.shtml
containing a GOC Validation date of 2 September 2013. Enrichment analysis
was performed to consider the content of (1) BUD31-associated proteins, or
(2) genes with enhanced IR. Gene symbols annotated to BUD31-associated
proteins were cross tabulated against all Gene Ontology annotations. Genes with
enhanced IR were cross tabulated against the subset of Gene Ontology annotations
for genes considered in this analysis. We used Fisher’s exact test to determine P
values for the proportion of genes overlapping each annotation set.
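
For a single annotation set, the cross-tabulation above reduces to a 2 × 2 table, and Fisher's exact test on that table is a hypergeometric tail probability. The following Python sketch shows the calculation using only the standard library. The function and argument names are hypothetical, and the one-sided (enrichment) alternative is an assumption, since the text does not state which alternative was used.

```python
from math import comb

def fisher_enrichment_p(overlap, set_size, annotated, background):
    """One-sided Fisher's exact P value for enrichment.

    overlap    -- genes in the query set that carry the annotation
    set_size   -- number of genes in the query set
    annotated  -- genes in the background that carry the annotation
    background -- total number of genes considered
    Returns P(X >= overlap) for X ~ Hypergeometric(background, annotated, set_size).
    """
    upper = min(set_size, annotated)
    total = comb(background, set_size)
    tail = sum(comb(annotated, k) * comb(background - annotated, set_size - k)
               for k in range(overlap, upper + 1))
    return tail / total
```

For example, `fisher_enrichment_p(5, 5, 5, 10)` evaluates the chance that all five query genes fall in a five-gene annotation drawn from ten background genes.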

BiFC. BUD31 was cloned into the pQCXIN-N-YFP fusion vector, in which the 
BUD31 N terminus was fused to the N-terminal domain (residues 1-155) of 
Venus yellow fluorescent protein (YFP). Human splicing factor cDNAs were 
individually recombined into retroviral vectors with C-terminal Venus YFP (resi- 
dues 156-239) tags at the N-terminal ends. SUM159 breast cancer cells were 
transduced with these bait and prey BiFC retroviruses, and cellular fluorescence 
was analysed by flow cytometry in triplicate. 

BUD31 mutagenesis. Wild-type and mutant BUD31 cDNAs were generated by 
gene synthesis (IDT DNA) and recombined into the pQCXIN-N-YFP fusion 
vector. Mutant BUD31 consisted of substituting human BUD31 amino acid resi- 
dues 105-114 with an equivalent number of glycine residues (codon GGA). 

In vitro competition assay. MYC-dependent SUM159 breast cancer cells with 
pINDUCER-shBUD31-3’ UTR were transduced with viruses containing wild-type 
or mutant BUD31 or negative control cDNA recombined into pQCXIN-N-YFP 
vectors. Infected, GFP⁺ cells were mixed at an 80:20 ratio with non-transduced,
GFP⁻ parental cells, seeded into 96-well plates and treated either with or
without dox (1 µg ml⁻¹). At confluence, cells were passaged 1:10 and processed
for flow cytometry. The in vitro competition assay was continued for two passages. 

Immunoblotting. Cells were lysed in 1× SDS sample buffer (62.5 mM Tris-HCl,
pH 6.8, 10% glycerol, 2% SDS, 2.5% β-mercaptoethanol) and heated at 95 °C for
12 min. The following antibodies were used for western blotting: Flag (Sigma,
A8592), BUD31 (ProteinTech, 11798-1-AP), SF3B1 (Bethyl, A300-996A), Prp19 
(Bethyl, A300-101A), U2AF1 (Bethyl, A302-079A), SF3A1 (Bethyl, A301-603A), 
EFTUD2 (Bethyl, A300-957A), SNRPF (Abcam, 154870), HER2 (Millipore, 06- 
562), EGFR (Cell Signaling, 2232), cleaved caspase-3 (Cell Signaling, 9664), RPS8 
(Assay Biotechnology, R12-3466), EIF2S1 (Abgent, AP13469s), eIF3I (p36) 
(Biolegend, 646701) and c-Myc (D84C12) (Cell Signaling, 5605). Vinculin 
(Sigma, V9131) and Ran (BD Biosciences, 610340) were used as loading controls. 

In vitro transcription. Uniformly ³²P-UTP radiolabelled MINX pre-mRNA was
in vitro transcribed from a BamHI-digested plasmid, DNase I (Ambion) treated
and gel-isolated on an 8 M urea 6% polyacrylamide gel.

In vitro splicing. HeLa nuclear extracts used for in vitro splicing assays were made
as described previously from HeLa cells transduced with an inducible BUD31-targeting
shRNA and grown in the presence or absence of 1 µg ml⁻¹ dox. Splicing
reactions of 15 µl contained: 8 nM RNA substrate, 0.8 mM DTT, 1.7 mM magnesium
acetate, 1.7 mM ATP, 17 mM phospho-creatine, 20 mM glycine, 1 U µl⁻¹
RNasin Plus (Promega), 3.7% PVA and 50 µg of HeLa nuclear extracts. Splicing
reactions were incubated for indicated time points at 30 °C and stopped by digestion
with proteinase K (Ambion) for 30 min at 45 °C followed by RNA purification.
RNA purified from splicing reactions was electrophoresed on 8 M urea 8%
polyacrylamide gels, then exposed to a phosphorimager screen (Typhoon Trio
phosphorimager, GE Healthcare). Alternatively, RNA purified from in vitro splicing
reactions was added to RT-PCR reactions as previously described with
primers in exons 1 and 2 of MINX (forward: 5'-CGGAATTCGAGCTCGCCC-3'
and reverse: 5'-GGATCCCCACTGGAAAGA-3'). PCR products were run on 6%
non-denaturing polyacrylamide gels and visualized after staining with ethidium
bromide.

Spliceosome complex formation assay. In vitro splicing reactions were carried 
out as described above, placed on ice, and heparin was added to a final concentration
of 2 µg µl⁻¹. Reactions were incubated in the presence of heparin at 30 °C
for 5 min and immediately loaded onto 0.75-mm non-denaturing 4% acrylamide-
0.4% agarose composite gels. Gels were run at 250 V at room temperature in 1×
tris-glycine running buffer for 3 h, then placed on Whatman paper and exposed to 
a phosphorimager cassette. 

RNA isolation and qRT-PCR. RNA isolation was performed with the RNeasy 
Mini kit (Qiagen). Reverse transcription was performed using the High Capacity 
RNA-to-cDNA Master Mix (Applied Biosystems), and qPCR was performed 
using SYBR Green Master Mix (Applied Biosystems). The following primers were 
used: BUD31 forward: 5'-ACCAACTTCGGGACGAACTG-3', reverse:
5'-CGGCCCACTTCCAGCTT-3'; EFTUD2 forward:
5'-CCTTCGTGTTGTCAGAGAGTGTCT-3', reverse: 5'-TGGGTTGGAGGTTGGTGAGT-3';
SF3B1 forward: 5'-GTGGACAAAATGGCGAAGAT-3', reverse:
5'-GAGCTTCATCAAGAGCTGCC-3'; SNRPF forward: 5'-GGGAATGGAGTACAAGGGCT-3',
reverse: 5'-CCCAGATGTCCAGACAAAGC-3'; U2AF1 forward:
5'-ACGTTTAGCCAGACCATTGC-3', reverse: 5'-TGTTCCTGCATCTCCACATC-3';
GAPDH forward: 5'-CCTCCCGCTTCGCTCTCT-3', reverse: 5'-TGGCGACGCAAAAGAAGAT-3'.

RNA-seq. pINDUCER11-shBUD31-3' UTR-infected MYC-ER HMECs were cultured
for 72 h with/without 16 ng ml⁻¹ dox, and for 48 h with/without 300 nM
tamoxifen in triplicates. Total RNA was isolated using the RNeasy kit (Qiagen).
RNA samples were rRNA depleted, and NGS libraries were constructed and
sequenced as 75 bp paired-end reads on an Illumina HiSeq 2000.

Quality assessment of RNA-seq. The quality of RNA-seq NGS reads was evaluated
using the FastQC application
(http://www.bioinformatics.babraham.ac.uk/projects/fastqc/).

Alignment of RNA-seq data. RNA-seq NGS reads were mapped using STAR
RNASeq aligner (version 2.3.1). To improve mapping accuracy, the database
file of splice junctions
(http://it-collab01.cshl.edu/shares/gingeraslab/www-data/dobin/STAR/STARgenomes/GENCODE/Old/gencode.v14.annotation.gtf.sjdb)
was supplied at the genome index generation step with command line option
--sjdbOverhang 7, together with
http://it-collab01.cshl.edu/shares/gingeraslab/www-data/dobin/STAR/STARgenomes/GENCODE/Old/hg19_Gencode14.overhang75/
and default parameters. Duplicate reads were marked with the MarkDuplicates
function of the Picard-tools software package (http://picard.sourceforge.net; version
1.107) using default settings.

Intron-exon junction definition. To prevent confounding effects in our analysis 
of IR within HMECs, we confined our analyses to exons in non-overlapping 
genes that are included within all isoforms of a given gene (75,623 junctions in 
6,861 genes). 

Intron-exon junctions were obtained using the University of California 
Santa Cruz Genome Browser ‘knownGene’ table (downloaded 4 June 2014). 
Constitutive junctions were defined as junctions that (1) appear in each transcript 
annotated to a given gene symbol, (2) do not overlap with any transcript annotated 
to a different gene symbol, and (3) do not mark the start or stop of a transcript. 
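
The three criteria for constitutive junctions can be sketched in a few lines of Python. This is an illustration under simplifying assumptions, not the authors' script: a junction is represented as an exon boundary coordinate, every transcript is a list of `(exon_start, exon_end)` pairs sorted by position, all genes are taken to lie on one chromosome, and the name `transcripts_by_gene` is hypothetical.

```python
def constitutive_junctions(transcripts_by_gene):
    """Return {gene: sorted exon-boundary positions meeting criteria 1-3}.

    transcripts_by_gene maps a gene symbol to a non-empty list of transcripts,
    each a position-sorted list of (exon_start, exon_end) tuples.
    """
    # Genomic span of every transcript, used for the overlap check (criterion 2).
    spans = {gene: [(tx[0][0], tx[-1][1]) for tx in txs]
             for gene, txs in transcripts_by_gene.items()}
    result = {}
    for gene, txs in transcripts_by_gene.items():
        boundaries_per_tx = [{pos for exon in tx for pos in exon} for tx in txs]
        # Criterion 3: drop positions that start or stop any transcript.
        tx_ends = {pos for tx in txs for pos in (tx[0][0], tx[-1][1])}
        # Criterion 1: keep boundaries present in every transcript of the gene.
        shared = set.intersection(*boundaries_per_tx) - tx_ends
        other_spans = [s for g, ss in spans.items() if g != gene for s in ss]
        result[gene] = sorted(
            pos for pos in shared
            if not any(start <= pos <= end for start, end in other_spans)
        )
    return result
```

A production version would also need to handle chromosomes and strands, which are omitted here for brevity.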

Junction IR calculation. Because analysis of intronic reads may be influenced by
the presence of stable RNAs within introns and/or spliced lariats, we calculated 
junction IR as the ratio of exon-intron reads to exon-exon reads, restricting the 
analysis to reads directly spanning exon-intron or exon-exon junction sequences. 
We used R together with the Rsamtools package to calculate IR. In brief, for each 
intron-exon junction, we extracted all non-duplicate reads overlapping this junc- 
tion. Next, we assigned these reads into two categories: (1) ‘intronic’ if the read 
mapped to at least the first base of the intron, (2) ‘exonic’ if none of the bases of the 
read mapped to the first base of the intron and at least one base mapped to a 
subsequent exon. We counted the total number of reads assigned into each cat- 
egory for each junction. IR was calculated as: 


Gii+l 
E,i+1 


IR, = log, 


in which IR;, represents the IR score for junction j in sample i, and J; and Ej, refer 
to the count of reads classified as intronic and exonic for junction j in sample i, 
respectively. To avoid ratios with 0 in the denominator, we added 1 to each of these 
counts. The scripts used to conduct this calculation are available on request. 
We restricted all following analyses to intron-exon junctions with an average of
at least 25 total (intronic and exonic) reads in the control and MYC-hyperactivated 
samples. 
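
The read classification, the pseudocount ratio and the coverage filter described above can be sketched as follows. The original analysis used R with Rsamtools; this Python version is only an illustration. It assumes the logarithm is base 2, represents each read as a list of aligned `(start, end)` blocks, and uses hypothetical function names throughout.

```python
from math import log2

def classify_read(blocks, intron_first_base, next_exon):
    """Label a read 'intronic', 'exonic' or None for one junction.

    blocks            -- aligned (start, end) segments of the read, inclusive
    intron_first_base -- coordinate of the first base of the intron
    next_exon         -- (start, end) of the subsequent exon
    """
    if any(start <= intron_first_base <= end for start, end in blocks):
        return "intronic"
    exon_start, exon_end = next_exon
    if any(start <= exon_end and end >= exon_start for start, end in blocks):
        return "exonic"
    return None

def junction_ir(intronic, exonic):
    """IR score: log2 ratio of intronic to exonic counts, each with 1 added."""
    return log2((intronic + 1) / (exonic + 1))

def passes_coverage(counts_per_sample, min_mean_total=25):
    """Keep a junction if its mean total (intronic + exonic) count is >= 25."""
    totals = [intronic + exonic for intronic, exonic in counts_per_sample]
    return sum(totals) / len(totals) >= min_mean_total
```

A junction with equal intronic and exonic counts scores 0; positive scores indicate relatively more reads crossing into the intron.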

Gene IR calculation. For the cumulative distribution analyses, the IR scores
for all junctions in a gene were averaged.
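
As a small illustration of this gene-level summary, the sketch below averages junction scores per gene and builds the empirical cumulative distribution used for plotting. The input layout (an iterable of `(gene, score)` pairs) and the function names are assumptions made for the example.

```python
from collections import defaultdict

def gene_ir_scores(junction_scores):
    """Average junction IR scores per gene: (gene, score) pairs -> {gene: mean}."""
    by_gene = defaultdict(list)
    for gene, score in junction_scores:
        by_gene[gene].append(score)
    return {gene: sum(vals) / len(vals) for gene, vals in by_gene.items()}

def ecdf(values):
    """Return (value, cumulative fraction) pairs in ascending order of value."""
    ordered = sorted(values)
    n = len(ordered)
    return [(v, (k + 1) / n) for k, v in enumerate(ordered)]
```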

Gene annotation. A custom gene annotation file was generated to correspond to 
the set of intron-exon junctions considered in the IR analysis. In brief, exons were
defined as: (1) an exon flanked by two junctions annotated to the same symbol, 
and (2) an exonic region flanked by one junction and conserved across all tran- 
scripts annotated to the same symbol. 

Statistical analysis of RNA-seq. Statistical analyses were performed using the 
open source statistical programming environment ‘R’. Empirical cumulative dis- 
tributions of IR scores were compared using two-sided Kolmogorov-Smirnov test 
and Wilcoxon test. 
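
The Kolmogorov-Smirnov comparison rests on the largest vertical gap between two empirical cumulative distributions. The analyses were run in R; the Python sketch below computes only that statistic, without a P value, to make the quantity concrete. The function name is hypothetical.

```python
from bisect import bisect_right

def ks_statistic(sample_a, sample_b):
    """Two-sample KS statistic: maximum absolute gap between the two ECDFs."""
    a_sorted, b_sorted = sorted(sample_a), sorted(sample_b)
    largest_gap = 0.0
    # The gap between two step functions can only change at observed values.
    for x in a_sorted + b_sorted:
        frac_a = bisect_right(a_sorted, x) / len(a_sorted)
        frac_b = bisect_right(b_sorted, x) / len(b_sorted)
        largest_gap = max(largest_gap, abs(frac_a - frac_b))
    return largest_gap
```

Identical samples give 0, and fully separated samples give 1.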

Permutation-based test of significance. The significance of the difference of 
empirical cumulative distributions of junction-level IR scores was evaluated using 
a permutation-based approach. The null hypothesis was that splicing perturba- 
tions had no effect on IR changes in the MYC-normal and MYC-hyperactivated 
states. To model this null hypothesis, the treatment information was blinded to the 
assignments of MYC activity. We generated a third control sample by randomly 
selecting half of the junctions from samples 1-control and 2-control. Control and 
MYC samples were grouped as 6 ‘normal’ samples, and LowBUD31 and 
Myc_LowBUD31 samples were grouped as ‘splicing perturbed’ samples. Next, 
we comprehensively generated all possible normal and splicing perturbed con- 
trasts by subtracting the average of each junction IR score of three normal samples 
from that of the splicing perturbed group. The empirical distribution of all possible 
double differences was generated and used to assign significance to the original 
observations. An analogous approach was used to evaluate the difference of empir- 
ical cumulative distributions of gene-level IR scores. 
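
One possible reading of this permutation scheme is sketched below in Python. It is a simplification and not the authors' code: each sample is reduced to a single summary value (for example its mean junction IR score), MYC labels are reassigned by enumerating every equal split of the six normal and six splicing-perturbed samples, and the double difference is recomputed for each relabeling. All function names are hypothetical, and the two-sided P value is an illustrative choice.

```python
from itertools import combinations
from statistics import mean

def double_difference(ctrl, low, myc, myc_low):
    """(Myc_LowBUD31 - MYC) minus (LowBUD31 - Control), using group means."""
    return (mean(myc_low) - mean(myc)) - (mean(low) - mean(ctrl))

def halves(values):
    """Yield every split of values into two equal-sized groups."""
    idx = range(len(values))
    for pick in combinations(idx, len(values) // 2):
        chosen = set(pick)
        yield ([values[i] for i in idx if i in chosen],
               [values[i] for i in idx if i not in chosen])

def null_double_differences(normal, perturbed):
    """Double differences over all relabelings of MYC status within each group."""
    return [double_difference(n_rest, p_rest, n_pick, p_pick)
            for n_pick, n_rest in halves(normal)
            for p_pick, p_rest in halves(perturbed)]

def permutation_p(observed, null):
    """Two-sided P: share of relabelings at least as extreme as the observation."""
    return sum(1 for d in null if abs(d) >= abs(observed)) / len(null)
```

With six samples per group there are 20 × 20 = 400 relabelings, so the smallest attainable two-sided P value in this scalar version is 2/400.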


qPCR IR validation assay. Amplification reactions were prepared using SYBR
Select Master Mix (Applied Biosystems) according to the manufacturer’s instructions
with a final primer concentration of 300 nM. Reactions were performed using
a StepOnePlus Real Time PCR System (Applied Biosystems) with an initial
incubation at 95 °C for 10 min followed by 40 cycles of 15 s at 95 °C and 1 min
at 60 °C. Primers were designed using Primer3 (available at http://bioinfo.ut.ee/primer3/)
and assessed for quality using Beacon Designer (http://www.premierbiosoft.com/qpcr/)
and UNAFold (https://www.idtdna.com/UNAFold). Primers used for each reaction were:
HTRA1_IE forward: 5′-GCGTTCATTTTAAGGTGCTACAGG-3′, reverse: 5′-TGGGCATTTGTCACGATCAGT-3′;
HTRA1_EE forward: 5′-GACGTGGTGGAGAAGATCGC-3′, reverse: 5′-AAACCCAGACCCACTAGCCA-3′;
PRPF19_IE forward: 5′-TCCCCTTGTGTGACCTTCTCT-3′, reverse: 5′-AGAATCTCCGTCCATTGTTTGC-3′;
PRPF19_EE forward: 5′-AGAACTTTAAGACTTTGCAGCTGG-3′, reverse: 5′-TCTCCGTCCATTGTTTGCAGA-3′;
UBALD2_IE forward: 5′-GCTGCGTTTCCTGACTCCG-3′, reverse: 5′-GITGGTGGCTGTTGGGAATGT-3′;
UBALD2_EE forward: 5′-CAGTTGCTGCAGGCGGCC-3′, reverse: 5′-TGGAAGAACGTGCTCAGCGC-3′.

Ultramer oligonucleotides (Integrated DNA Technologies) were synthesized 
to match the predicted amplicon of each primer pair, and standard curves 
were generated for each reaction using threefold serial dilutions of these control 
templates ranging in concentration from 4.0 X 107’? M to 1.6 X 10 '*M. The 
sequence AAGAA was added to both the 5’ and 3’ ends of each template to 
facilitate primer binding. Control template sequences for intron-exon (IE) and 
exon-exon (EE) regions were as follows: HTRA1_IE: 5′-AAGAAGCGTTCATTT
TAAGGTGCTACAGGCTTAAGTGTGTACTCCTTTGGATTTTAGGCTTCCG
TTTTCTAAACGAGAGGTGCCGGTGGCTAGTGGGTCTGGGTTTATTGTG
TCGGAAGATGGACTGATCGTGACAAATGCCCAAAGAA-3′; HTRA1_EE:
5′-AAGAAGACGTGGTGGAGAAGATCGCCCCTGCCGTGGTTCATATCGA
ATTGTTTCGCAAGCTTCCGTTTTCTAAACGAGAGGTGCCGGTGGCTAG
TGGGTCTGGGTTTAAGAA-3′; PRPF19_IE: 5′-AAGAATCCCCTTGTGTG
ACCTTCTCTCTTTCTATTTCTGGCAGGTAAAGTCACTGATCTTTGACC
AGAGTGGTACCTACCTGGCTCTTGGGGGCACGGATGTCCAGATCTAC
ATCTGCAAACAATGGACGGAGATTCTAAGAA-3′; PRPF19_EE: 5′-AAGA
AAGAACTTTAAGACTTTGCAGCTGGATAACAACTTTGAGGTAAAGTCA
CTGATCTTTGACCAGAGTGGTACCTACCTGGCTCTTGGGGGCACGGATG
TCCAGATCTACATCTGCAAACAATGGACGGAGAAAGAA-3′; UBALD2_IE:
5′-AAGAAGCTGCGTTTCCTGACTCCGCCTGGCCCGCCGTGTCACTGCC
CTGTTTGTCCGCAGACCGCGCTGAGCACGTTCTTCCAAGAAACCAACA
TTCCCAACAGCCACCACAAGAA-3′; UBALD2_EE: 5′-AAGAACAGTTGCT
GCAGGCGGCCCACTGGCAGTTCGAGACCGCGCTGAGCACGTTCTTCC
AAAGAA-3′.

Ct values from each reaction were interpolated on the standard curve generated
using the corresponding control template to approximate the concentration of
cDNA template in each experimental sample. These values were then reported as
the ratio of intron-exon to total (IE + EE) cDNA template in each sample.
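
The interpolation step can be sketched as follows. This is an illustrative outline under stated assumptions, not the authors' analysis code: it assumes the standard curve is a straight line relating Ct to log10 of template concentration, fitted by ordinary least squares, and every numeric value below (slope, intercept, concentrations) is hypothetical.

```python
import math


def fit_standard_curve(concentrations, ct_values):
    """Least-squares fit of Ct = slope * log10(concentration) + intercept."""
    xs = [math.log10(c) for c in concentrations]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def concentration_from_ct(ct, slope, intercept):
    """Invert the standard curve to estimate template concentration."""
    return 10 ** ((ct - intercept) / slope)


def intron_retention_ratio(ie_concentration, ee_concentration):
    """Ratio of intron-exon template to total (IE + EE) template."""
    return ie_concentration / (ie_concentration + ee_concentration)


# Hypothetical threefold dilution series and an idealized Ct response.
dilutions = [1e-12 / 3 ** k for k in range(6)]
cts = [-3.3 * math.log10(c) - 20.0 for c in dilutions]
slope, intercept = fit_standard_curve(dilutions, cts)

ie = concentration_from_ct(-3.3 * math.log10(2e-13) - 20.0, slope, intercept)
ee = concentration_from_ct(-3.3 * math.log10(6e-13) - 20.0, slope, intercept)
ratio = intron_retention_ratio(ie, ee)
```

Each primer pair would have its own curve, fitted from the dilution series of its matching control template, so IE and EE concentrations are estimated separately before the ratio is taken.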

Transcription pulse assay. MYC-ER HMECs with pINDUCER11-shBUD31-3′UTR
were cultured with/without 16 ng ml⁻¹ dox and/or with/without 300 nM
4-OHT. Cells were pulsed with 500 µM 4-thiouridine (4-SU, Sigma) for 2 h, and
collected for total RNA using RNeasy mini kit (Qiagen). 4-SU-labelled RNA was
purified from 20 µg total RNA. Isolation of newly transcribed RNA was performed
as described³⁷ using 100 µl streptavidin beads (Miltenyi Biotec).

Poly(A)⁺ RNA isolation. Dynabeads Oligo(dT)₂₅ (Life Technologies) were equilibrated
with 50 µl lysis/binding buffer, and total RNA was heat denatured (70 °C
for 2 min) before binding poly(A)⁺ RNA to Dynabeads. Isolation of mRNA was
performed according to manufacturer’s instructions. Poly(A)⁺ RNA concentrations
were measured with a fluorescence plate reader (Molecular Devices) using
Quant-iT RiboGreen reagent (Life Technologies).

Poly(A)⁺ RNA LNA FISH. pINDUCER11-shBUD31-3′UTR-transduced MYC-ER
HMECs were seeded onto collagen-coated 8-well glass chamber slides and
cultured with/without 16 ng ml⁻¹ dox and with/without 300 nM tamoxifen. Cells
were treated with/without 2 µg ml⁻¹ actinomycin D (Gibco) or DMSO for 5 h
before fixation in 4% formaldehyde and 5% acetic acid in PBS for 15 min at room
temperature. Fixed cells were washed with PBS, permeabilized with proteinase K
(5 µg ml⁻¹, Life Technologies) and treated with/without RNase A (100 µg ml⁻¹,
Sigma) for 30 min at 37 °C in PBS. Dehydration of the cells was performed with
70%, 95% and 100% ethanol solutions. FITC-labelled oligo(dT)₂₅ locked nucleic
acid (LNA) probes were heated to 90 °C for 4 min, then cooled to hybridization
temperature (55 °C). Dehydrated and dried cells were incubated in 40 nM of LNA
probes in hybridization buffer (50% formamide, 2× SSC, 50 mM NaPi, pH 7.0,
10% dextran sulphate) overnight at 55 °C. Chamber slides were washed with
5× SSC, 1× SSC, 0.2× SSC and PBS, and dehydrated before counterstaining
with DAPI and mounting with Fluoromount-G (Southern Biotech). Cells were
imaged using a Nikon Ti-E inverted microscope with 40× air objective and Andor
Zyla 4.2 sCMOS camera. For each treatment condition and actinomycin time
point, =150 cells were analysed for mean FITC intensity. Cellular FITC values
were adjusted for background fluorescence by subtracting the mean extra-cellular
pixel value. Image analysis was performed using Nikon Elements.

Luminescent apoptosis assays. Caspase-3/7 activity was assessed in MYC-ER 
HMECs and breast cancer cell lines by incubating Caspase-Glo 3/7 Reagent with 
cells in triplicate wells of a 96-well plate and measuring luminescence with a plate 
reader (Molecular Devices). Luminescence was normalized using cell numbers 
determined by Hoeschst3321 staining of a duplicate plate, followed by nuclei 
counting using the Celigo Imaging Cell Cytometer (Brooks). 

Tumorigenicity and metastasis assays. SUM159 breast cancer cells were transduced
with pINDUCER11-shBUD31-3′UTR virus and analysed by flow cytometry
to confirm >98% transduction. In total 8 X 10° transduced cells were injected
with matrigel (BD Biosciences) subcutaneously into the flank of four-week-old
female athymic nude Foxn1-nu mice (Harlan Labs). Tumour volume was measured
using calipers, and once tumours achieved 150 mm³, mice were randomized
onto and maintained on sucrose water (−dox) or sucrose water with dox (+dox).

For mixed population experiments, MDA-MB-231-LM2 breast cancer cells
were individually transduced with pINDUCER11-shRNAs targeting the indicated
genes at an MOI appropriate to transduce all cells (1.3-1.5). The individual populations
were mixed at equal ratios in vitro and expanded before injection. Around
3 X 10° or 2 X 10° mixed population cells were injected subcutaneously into the
right flank or into the lateral tail vein of four-week-old female athymic nude
Foxn1-nu mice (Harlan Labs), respectively. Subcutaneous tumour volume was
measured with calipers over time. Mice were randomized onto sucrose water
(−dox) or sucrose water with dox (+dox) after tumours exceeded 150 mm³.
Lung metastatic progression was monitored and quantified using noninvasive
bioluminescence as described previously’®. When tumours reached 1,000 mm³ or
the total luminescence flux reached 1 X 10°, genomic DNA from dissected tumours
or lungs was collected using the QIAamp DNA mini kit (Qiagen). qPCR was
performed with SYBR Green PCR Master Mix (Life Technologies) using the manufacturer’s
recommendations and the following primers. Experimental target Ct
values were normalized to the TRE Ct values, and NCOR2 was used as a negative
control.

The following primers were used.
BUD31 forward: 5′-TGGAAGACATCTGCGTGGTATT-3′, reverse: 5′-CGCGCAAACCTAAAGGCATA-3′;
SF3B1 forward: 5′-GCCGTATCATTAGTACGCCATA-3′, reverse: 5′-TCGATCCTAGGACGGGGTAT-3′;
MYC forward: 5′-GCCGGCCATATTTTCACTTC-3′, reverse: 5′-CACACCTACCGAAAAACAAAC-3′;
NCOR2 forward: 5′-AACTTCCGGTGCTGTCGTTT-3′, reverse: 5′-CGCGTCCTAGGTAATACGACTCA-3′;
TRE forward: 5′-TGTACGGTGGGAGGCCTATATAA-3′, reverse: 5′-GCGTCTCCAGGCGATCTG-3′.

For SD6 drug infusion studies, 3 X 10° or 2 X 10° MDA-MB-231-LM2 breast
cancer cells were injected into the flank or lateral tail vein of four-week-old female
athymic nude Foxn1-nu mice (Harlan Labs), respectively. For mice with subcutaneous
tumours, jugular vein catheters (SAI Infusion Technologies) were surgically
implanted into each mouse 13-16 days after injection, and mice were randomized to
receive vehicle (n = 11) or SD6 (n = 10) infusion. Tail-vein-injected mice were
randomized to receive vehicle (n = 7) or SD6 (n = 6) infusion 1 day after tail vein
injections. Animals received daily infusions of vehicle (10% 2-hydroxypropyl-β-cyclodextrin
dissolved in 50 mM Na₂HPO₄/NaH₂PO₄, pH 7.4) or 50 mg kg⁻¹ of
SD6 for 20 consecutive days (subcutaneous cohort) or 10 consecutive weekdays
(tail vein cohort). Mice were infused via jugular catheter at a rate of 3.5 µl min⁻¹
with a Fusion 200 Touch Syringe Pump (SAI Infusion Technologies). The total
volume infused did not exceed 500 µl per day. Subcutaneous tumour volumes were
monitored with calipers, and lung metastatic progression was monitored with
noninvasive bioluminescence. Mice were euthanized once tumours reached
2,000 mm³ or the total luminescence flux reached 1 X 10”. In progression-free
survival analyses, progression is defined as a fivefold increase in pulmonary bioluminescence
relative to initial values or a fourfold increase in subcutaneous tumour
volume relative to its volume at time of randomization.
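
The progression criterion above can be expressed as a short sketch. The fold thresholds come from the text; the function name, the dictionary keys, the choice of a 'greater than or equal to' comparison and the measurement series are illustrative assumptions.

```python
# Fold increases that define progression, as stated in the text.
PROGRESSION_FOLD = {
    "pulmonary_bioluminescence": 5.0,   # relative to initial values
    "subcutaneous_tumour_volume": 4.0,  # relative to volume at randomization
}


def first_progression_day(measure, baseline, series):
    """Return the first day on which the measurement reaches the threshold.

    `series` is a list of (day, value) pairs in chronological order.
    Returns None if the animal never meets the progression criterion.
    """
    threshold = PROGRESSION_FOLD[measure] * baseline
    for day, value in series:
        if value >= threshold:
            return day
    return None


# Hypothetical caliper measurements (mm^3) after randomization at 150 mm^3.
volumes = [(0, 150.0), (7, 400.0), (14, 610.0), (21, 900.0)]
day = first_progression_day("subcutaneous_tumour_volume", 150.0, volumes)
```

Animals that return `None` would be treated as censored at their last observation in a Kaplan–Meier analysis.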

Investigators responsible for monitoring and measuring the xenografts of indi- 
vidual tumours were not blinded. Simple randomization was used to allocate 
animals to experimental groups. All animal studies were performed in accordance 
with institutional and national animal regulations. Animal protocols were 
approved by the Institutional Animal Care and Use Committee at Baylor 
College of Medicine. 

Power analysis was used to determine appropriate sample size to detect signifi- 
cant changes in animal survival, which were based on previous survival analyses in 
our laboratory. All animals were included in analyses. 

Pooled shRNA screens in breast cancer cell lines. Pooled shRNA screens were
performed on 68 breast cancer lines and four non-malignant immortalized
mammary epithelial lines, essentially as described³⁸. In brief, cells are infected with
a lentiviral shRNA library at an MOI of 0.3, and passaged under standard conditions.
At 4 and 8 doublings, respectively, DNA is isolated and hybridized to a customized
chip to assess shRNA dropout. A detailed description of the results of these screens
will be published separately (R.M., A.S. and B.G.N., manuscript in preparation).

Correlation between MYC dependency and spliceosome dependency. First,
to calculate MYC dependency scores using probable on-target hairpins, we used
assay observations associated with 3 hairpins incorporated into the first ATARiS³⁹
solution for MYC. MYC dependency scores were generated using a hierarchical
linear model, with pooled shRNA screen observations as the independent variable
and two regression covariates: initial signal intensity (with coefficient $\beta_0$) and
linear time-course dropout trend (with coefficient $\beta_1$). The dropout trend is calculated
for each cell line separately, resulting in a per-cell-line MYC dropout score
(the value of coefficient $\beta_1$).

Second, the MYC dependency score was used in a hierarchical linear model to
search for associations with the essentiality of other genes (such as spliceosome
encoding genes). This model uses pooled shRNA screen observations as the independent
variable and three regression covariates: initial assay signal intensity
(with coefficient $\beta_0$), linear time-course dropout trend (with coefficient $\beta_1$), and
an interaction term between dropout trend and MYC dependency score (with coefficient
$\beta_M$). The P-value associated with the interaction term $\beta_M$ is used to determine
whether a significant association exists. A detailed description of this approach will
be published elsewhere (A.S., R.M. and B.G.N., manuscript in preparation).

A summary statistic using results from the single-gene analyses was used to test
the significance of the association between MYC dependency and the essentiality
of a gene set. For a gene set containing genes g, we calculate the gene set summary
statistic as

$$ -\sum_{\text{gene } g} \operatorname{sign}(\beta_{M_g}) \, \log_{10}\big(P\,\text{value}(\beta_{M_g})\big) $$

in which $\operatorname{sign}(\beta_M)$ and $P\,\text{value}(\beta_M)$ indicate the values associated with the regression
coefficient $\beta_M$. The resulting metric, termed a siMEM (mixed-effect model)
score, indicates the significance and correlation between sensitivity to MYC 
shRNAs and sensitivity to a group of shRNAs targeting a gene set (such as those 
targeting the spliceosome). A gene set (for example, spliceosome genes) for which 
a substantial number of genes are significantly associated with MYC dependency, 
and all with the same direction (sign) of association, will have a large positive score. 
When calculated for the gene set consisting of the core spliceosome, this value 
summarizes the direction and strength of the significance observed across genes in 
the spliceosome. To determine whether this observation is significant, the same 
statistic is calculated for 100,000 randomly drawn gene sets of the same size as the 
core spliceosome, yielding the null distributions of gene set summary statistics in 
Fig. 4b, c. 
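
The gene set summary statistic and its null distribution can be sketched as below. This is a minimal illustration, not the siMEM implementation: it assumes the per-gene interaction coefficients and P values are already available in a dictionary, and the gene names, values, default number of draws and the one-sided empirical P value are all hypothetical choices (the text uses 100,000 random gene sets).

```python
import math
import random


def sign(x):
    """Return -1, 0 or 1 according to the sign of x."""
    return (x > 0) - (x < 0)


def gene_set_score(results, gene_set):
    """Negative sum over genes of sign(coefficient) * log10(P value).

    `results` maps gene name -> (interaction coefficient, P value).
    """
    return -sum(sign(results[g][0]) * math.log10(results[g][1]) for g in gene_set)


def null_scores(results, set_size, n_draws=1000, seed=0):
    """Scores of randomly drawn gene sets of the same size as the test set."""
    rng = random.Random(seed)
    genes = sorted(results)
    return [gene_set_score(results, rng.sample(genes, set_size))
            for _ in range(n_draws)]


def empirical_p(observed, null):
    """Fraction of null scores at least as large as the observed score."""
    return sum(s >= observed for s in null) / len(null)


# Hypothetical single-gene results: gene -> (coefficient, P value).
results = {"A": (1.0, 0.01), "B": (-1.0, 0.1), "C": (2.0, 0.001), "D": (0.5, 1.0)}
observed = gene_set_score(results, ["A", "C"])
null = null_scores(results, set_size=2, n_draws=200, seed=1)
p = empirical_p(observed, null)
```

Genes with concordant signs and small P values reinforce one another, whereas genes with opposite signs cancel, which is why a coherent gene set yields a large positive score.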

Statistical analysis. All experiments were performed on biological replicates
unless otherwise specified. Sample size for each experimental group/condition is
reported in the appropriate figure legends and methods. For cell culture experiments,
sample size was not predetermined, and all samples were included in
analyses. For significance testing, analyses were chosen if data met the assumptions
of the tests. Data were checked for comparable variance before statistical analysis.
Statistically significant differences between control and experimental groups were 
determined using two-tailed unpaired Student’s t-test, one-way ANOVA with 
Tukey-Kramer minimum significant difference test, Mann-Whitney test, 
Kolmogorov—Smirnov test, Wilcoxon test, permutation-based test of significance, 
and log-rank test as indicated in the appropriate figure legend and methods text. 


31. Meerbrey, K. L. et al. The pINDUCER lentiviral toolkit for inducible RNA interference
in vitro and in vivo. Proc. Natl Acad. Sci. USA 108, 3665-3670 (2011).

32. Schuhmacher, M. et al. Control of cell growth by c-Myc in the absence of cell
division. Curr. Biol. 9, 1255-1258 (1999).

33. Malovannaya, A. et al. Streamlined analysis schema for high-throughput
identification of endogenous protein complexes. Proc. Natl Acad. Sci. USA 107,
2431-2436 (2010).

34. Zapp, M. L. & Berget, S. M. Evidence for nuclear factors involved in recognition of 5′
splice sites. Nucleic Acids Res. 17, 2655-2674 (1989).

35. Dignam, J. D., Lebovitz, R. M. & Roeder, R. G. Accurate transcription initiation by
RNA polymerase II in a soluble extract from isolated mammalian nuclei. Nucleic
Acids Res. 11, 1475-1489 (1983).

36. Echeverria, G. V. & Cooper, T. A. Muscleblind-like 1 activates insulin receptor exon
11 inclusion by enhancing U2AF65 binding and splicing of the upstream intron.
Nucleic Acids Res. 42, 1893-1903 (2014).

37. Dölken, L. et al. High-resolution gene expression profiling for simultaneous kinetic
parameter analysis of RNA synthesis and decay. RNA 14, 1959-1972 (2008).

38. Marcotte, R. et al. Essential gene profiles in breast, pancreatic, and ovarian cancer
cells. Cancer Discov. 2, 172-189 (2012).

39. Shao, D. D. et al. ATARiS: computational quantification of gene suppression
phenotypes from multisample RNAi screens. Genome Res. 23, 665-678 (2013).


Extended Data Figure 1 | Validation of BUD31 as a MYC-synthetic lethal
gene in HMECs. a, qRT-PCR analysis of BUD31 mRNA level (mean ± s.d.,
n = 3 biological replicates). b, Clonogenicity of MYC-ER HMECs with or
without MYC hyperactivation or BUD31 depletion (mean ± s.e.m., n = 4
biological replicates, **P < 0.01, two-tailed Student’s t-test). c, Caspase-3/7
activation by caspase luminescence assay (mean ± s.e.m., n = 3, ***P < 0.001,
one-way ANOVA). d, Flag-tagged protein levels in MYC-ER HMECs in
which vinculin was used as a loading control.


[Extended Data Figure 2, panel a: core spliceosomal proteins. Red highlighting of BUD31-associated proteins is not preserved in this text.]

ACIN1    DDX23      HNRNPF    ISY1      PLRG1   PRPF6   SF3B2     SNRPD2  SRSF7
AQR      DDX35      HNRNPH1   LSM2      PPIE    PRPF8   SF3B3     SNRPD3  SRSF9
BCAS2    DDX41      HNRNPH3   LSM3      PPIG    PTBP1   SF3B4     SNRPE   SYF2
BUB3     DDX46      HNRNPK    LSM4      PPIH    PUF60   SLU7      SNRPF   SYNCRIP
BUD31    DHX15      HNRNPL    LSM6      PPIL1   RALY    SMNDC1    SNRPG   TXNL4A
C9orf78  DHX8       HNRNPM    LSM7      PPIL2   RBM17   SNRNP200  SNW1    U2AF1
CACTIN   EFTUD2     HNRNPR    LSM8      PPIL3   RBM22   SNRNP27   SRP40   U2AF2
CD2BP2   FAM50A     HNRNPU    MAGOH     PPWD1   RBM8A   SNRNP40   SRP54   U2SURP
CDC40    HNRNPA0    HNRNPUL1  MATR3     PQBP1   RBMX    SNRNP70   SRRM1   USP39
CDC5L    HNRNPA1    HSPA4     NAA38     PRPF18  SART1   SNRPA     SRSF1   WBP11
CHERP    HNRNPA2B1  HSPA5     NFATC2IP  PRPF19  SF3A1   SNRPA1    SRSF12  WDR83
CRNKL1   HNRNPA3    HSPA8     NHP2L1    PRPF3   SF3A2   SNRPB     SRSF2   XAB2
CTNNBL1  HNRNPC     HTATSF1   NOSIP     PRPF31  SF3A3   SNRPB2    SRSF3   YBX1
CWC15    HNRNPD     HTRA2     PABPN1    PRPF4   SF3B1   SNRPC     SRSF4   ZMAT2
CXorf56  HNRNPDL    IK        PHF5A     PRPF4B  SF3B14  SNRPD1    SRSF6

[Extended Data Figure 2, panels b–j: heat map, spliceosome assembly schematic, immunoblots, gels and bar charts.]


Extended Data Figure 2 | BUD31 interacts with core spliceosomal factors
and is required for spliceosomal assembly and pre-mRNA splicing. a, 134
core spliceosomal proteins are listed. Proteins in red are shown to interact
with BUD31, as discovered by Flag-BUD31 immunoprecipitation mass
spectrometry and BUD31 BiFC. b, Heat map of BUD31-interacting
spliceosomal proteins, organized by spliceosome sub-complexes. A black-green
colour scale depicts normalized BiFC interaction values between spliceosomal
proteins and negative control protein (technical replicates in two left lanes)
and BUD31 (technical replicates in two right lanes). c, Spliceosomal snRNPs
(coloured circles) interact in a stepwise manner to excise intronic sequences
from pre-mRNA. snRNPs with proteins identified from the BUD31
immunoprecipitation and mass spectrometry are noted (blue outline) to be
BUD31-associated. d, Co-immunoprecipitation of Flag-BUD31 for non-spliceosomal
proteins. Input and immunoprecipitation blots probed by EIF2S1 and EIF31
were taken at different exposures to minimize background signal. e, Interaction
between N-YFP-tagged BUD31 and C-YFP-tagged spliceosomal (DDX46) or
cytoplasmic proteins (TRIM9, SOCS2 and EPHA8) was assessed by cellular
fluorescence (mean ± s.e.m., n = 3 technical replicates). f, Nuclear extracts
with or without BUD31 knockdown were incubated with pre-mRNA substrate,
and RT-PCR of unspliced RNA (top) and spliced RNA (bottom) was
performed, using primers at the indicated arrows (left). BUD31 protein levels in
the nuclear extracts were normalized to vinculin expression (middle) and
quantified (right). g, Radioactively labelled pre-mRNA (MINX) was incubated
with nuclear extracts with or without BUD31 depletion. RNA purified from
the splicing reaction was run on a denaturing gel and imaged by autoradiography.
The identities of prominent bands are based on size. Asterisk denotes
putative intron-lariat band. h, After in vitro splicing was performed as
described previously, products were electrophoresed on native gel, and
spliceosome complexes were visualized by autoradiography. Complex A and
nonspecific H complexes are labelled. i, Phosphorimager quantification of the
ratio of RNA in complex A compared to that in complex H. j, Interaction
between N-YFP-tagged wild-type (WT) or mutant BUD31 and C-YFP-tagged
splicing factors was assessed by cellular fluorescence (mean ± s.e.m., n = 2
technical replicates, ***P < 0.001, two-tailed Student’s t-test).


Extended Data Figure 3 | HMECs with oncogenic activation of HER2
and EGFR do not require BUD31. a, Cell number changes in HMECs
with inducible shBUD31 and constitutive HER2 or EGFR expression
(mean ± s.e.m.; n = 4 technical replicates; *P < 0.05, two-tailed Student’s
t-test). HER2 and EGFR protein is normalized to vinculin (right). b, MYC
protein levels in HMECs with constitutive HER2 or EGFR expression. c, MYC
induction by tamoxifen in MYC-ER HMECs does not increase cell proliferation
over time (mean ± s.e.m., n = 8 technical replicates).


Extended Data Figure 4 | Partial knockdown of core splicing factors is
MYC-synthetic lethal in HMECs. a–d, mRNA levels for core splicing
factors SF3B1 (a), U2AF1 (b), EFTUD2 (c) and SNRPF (d) were evaluated
by qRT-PCR (mean ± s.d., n = 3 technical replicates). e–i, Caspase-3/7
luminescence in MYC-ER HMECs with partial suppression of core
spliceosomal proteins (e–h) or spliceosome inhibitor SD6 (i) (mean ± s.e.m.,
n = 3 technical replicates, ***P < 0.001, one-way ANOVA).


Extended Data Figure 5 | BUD31 loss in MYC-hyperactivated cells
destabilizes mRNA. a, b, MYC-ER HMECs with inducible shBUD31 treated
with actinomycin D for 5 h were labelled with oligo(dT)₂₅ LNA probes via
fluorescence in situ hybridization. Cellular FITC intensity was assessed within
cellular (a) and nuclear (DAPI+) (b) regions. Data are represented as the
difference in cellular FITC intensity between 0 and 5 h of actinomycin D
treatment in each cell state (mean ± s.e.m., n = 150, ***P < 0.001, two-tailed
Student’s t-test).


Extended Data Figure 6 | BUD31 depletion in MYC-hyperactivated cells
enhances intron retention and decreases expression of cell-essential genes.
In MYC-hyperactive cells, 17 representative genes display increased IR and
decreased steady-state RNA levels after BUD31 knockdown. Depletion of these
genes by shRNA decreased cell viability (mean barcode abundance ± s.e.m.).
Twofold decrease in barcode abundance is noted by the dashed red line.
All values are reflective of three biological replicates, and genes are colour-coded
based on their Gene Ontology term annotation.

[Figure legend categories: unfolded protein binding; cytoskeleton constituent; DNA repair; DNA replication; telomere capping; cellular metabolic processes; mitosis; ER unfolded protein response; apoptosis; non-essential genes. Vertical axis: normalized barcode abundance.]


Extended Data Figure 7 | MYC-dependent breast cancer cells require
BUD31 for in vitro and in vivo growth. a, Relative cell number of SUM159
cells with doxycycline-inducible shBUD31 in vitro (mean ± s.e.m., n = 8
technical replicates, ***P < 0.001, two-tailed Student’s t-test). b, Caspase-3/7
luminescence in BUD31-depleted SUM159 cells (mean ± s.e.m., n = 3
technical replicates, ***P < 0.001, two-tailed Student’s t-test). c, d, SUM159
cells engineered with dox-inducible shBUD31 were subcutaneously
transplanted into mice and randomized onto dox treatment (−dox n = 10,
+dox n = 9). Loss of BUD31 in SUM159 xenografts inhibits tumour growth
(mean ± s.e.m., ***P < 0.001 at day 21, two-tailed Student’s t-test)
(c) and prolongs progression-free survival (d) in nude mice (P-value,
log-rank test).


Extended Data Figure 8 | BUD31 depletion does not affect levels of MYC
protein. a, MYC protein levels in MYC-ER HMECs with inducible
shBUD31 expression normalized to vinculin expression. To confirm specificity
of MYC antibody, HMECs without the MYC-ER construct were engineered
to express inducible MYC shRNA. b, MYC protein levels in SUM159 and LM2
cells with inducible shBUD31 normalized to vinculin expression. To confirm
specificity of MYC antibody, SUM159 cells were engineered to express
inducible MYC shRNA.


[Schematic labels: initial transplanted population (shControl and candidate shRNAs); −dox; Tumor 1; Tumor 2.]


Extended Data Figure 9 | Schematic for in vivo barcode-based competition 
assay. LM2 cells transduced with inducible shRNAs targeting negative 
control genes or candidate genes were mixed at an equal ratio. This mixed 
population was transplanted into mice, and tumours were allowed to form in 
the presence or absence of dox. At the experimental endpoint, genomic DNA 
was isolated for comparisons of relative barcode (shRNA) abundance in 
tumour genomic DNA. 


Extended Data Figure 10 | Spliceosome inhibitor SD6 inhibits MYC-dependent
cancer cells in vitro and in vivo. a, MYC-dependent breast cancer
cells (SUM159 and LM2) and MYC-normal immortalized epithelial cells (F7
and HME1) were cultured with SD6 at low density and analysed for clonogenic
growth. b, MYC-repressible human B-cell line P493-6 was treated with or
without 100 nM SD6 in the absence or presence of MYC hyperactivation for
four days, and cells were counted for relative cell number changes
(mean ± s.e.m., n = 3 biological replicates, ***P < 0.001, one-way ANOVA).
c, Kaplan–Meier survival analysis of nude mice with pulmonary seeding of
LM2 cells treated with or without SD6 for 10 days (vehicle n = 7, SD6 n = 6,
P-value by log-rank test).




doi:10.1038/nature14866 


Replisome speed determines the efficiency of the
Tus–Ter replication termination barrier


Mohamed M. Elshenawy¹, Slobodan Jergic², Zhi-Qiang Xu², Mohamed A. Sobhy¹, Masateru Takahashi¹, Aaron J. Oakley²,
Nicholas E. Dixon² & Samir M. Hamdan¹


In all domains of life, DNA synthesis occurs bidirectionally from 
replication origins. Despite variable rates of replication fork pro- 
gression, fork convergence often occurs at specific sites’. Escherichia 
coli sets a ‘replication fork trap’ that allows the first arriving fork to 
enter but not to leave the terminus region’*. The trap is set by 
oppositely oriented Tus-bound Ter sites that block forks on 
approach from only one direction* ’. However, the efficiency of fork 
blockage by Tus- Ter does not exceed 50% in vivo despite its appar- 
ent ability to almost permanently arrest replication forks in vitro** 

Here we use data from single-molecule DNA replication assays and
structural studies to show that both polarity and fork-arrest efficiency
are determined by a competition between rates of Tus displacement
and rearrangement of Tus–Ter interactions that leads to
blockage of slower moving replisomes by two distinct mechanisms.
To our knowledge this is the first example where intrinsic differences
in rates of individual replisomes have different biological outcomes.

In the circular E. coli chromosome, two replication forks move from 
the replication origin to converge opposite in a region that contains ten 
23-base-pair Ter (termination) sites and the dif site for chromosome 
segregation®”’ (Fig. 1a). The Ter sites are arranged in two oppositely 


[Figure 1, panels a–h. Panel g annotations: stop/restart rate = 840 ± 40 bp s⁻¹ (N = 31); bypass rate = 1,700 ± 140 bp s⁻¹ (N = 33). Axis labels: DNA synthesis (kbp); time (s); percentage of events (%); Tus concentration (nM); rate (bp s⁻¹).]


Figure 1 | Fate of the E. coli replisome upon encountering Tus–TerB.
a, Polarity of the replication fork trap. Replication occurs bidirectionally from
oriC. Each fork passes through the first five permissive (P) Ter sites, but is
arrested on encounter with one of the next five non-permissive (NP) sites.
b, Structure of the Tus–Ter locked complex (PDB: 2I06)°. Strand separation of
C6 (yellow) at the NP face induces its flipping into a specific binding pocket on
Tus. c, Schematic of the single-molecule setup for observing leading-strand
synthesis, which converts the tethered dsDNA (long) to ssDNA (short),
displacing the bead opposite the flow. d, Representative synthesis trajectories
upon encountering Tus–TerB oriented to the P (PTerB, left) or NP (NPTerB,
right) faces. kbp, kilobase pairs. e, Percentages of forks that bypassed,
transiently or fully stopped at the P or NP face. Error bars correspond to
standard deviations of binomial distributions; N = 60, 64 and 37 for NPTerB
(−Tus), NPTerB (+Tus) and PTerB (+Tus), respectively. f, Effect of Tus
concentration on arrest activity at the NP face. Tus was present continuously
with the replication proteins. Washing excess DNA-unbound Tus (80 nM)
before introduction of replication proteins resulted in 38% stoppage. g, Rate
dependence of replication stalling at the NP face. Rate distributions of events
that bypassed (grey; N = 33) or stopped/restarted (blue bars; N = 31) were fit
with Gaussian distributions. Fit lines are shown; the uncertainty corresponds to
the standard error. h, Percentages of forks that bypassed, transiently or fully
stopped at Tus bound to the NPTerB site containing a bubbled-DNA structure
in place of base pairs 3–7 of TerB, while keeping C6 (5-mismatch C6-NPTerB).
Error bars correspond to standard deviations of binomial distributions; N = 14
and 27 in absence and presence of Tus, respectively.


‘Division of Biological and Environmental Sciences and Engineering, King Abdullah University of Science and Technology, Thuwal 23955, Saudi Arabia. *Centre for Medical & Molecular Bioscience, Illawarra 
Health & Medical Research Institute and University of Wollongong, New South Wales 2522, Australia. 


394 | NATURE | VOL 525 | 17 SEPTEMBER 2015 


©2015 Macmillan Publishers Limited. All rights reserved 


oriented groups”’, and each of them is tightly bound to the monomeric 
protein Tus*’®. The lack of symmetry in Ter sequences fixes the ori- 
entation of the Tus-Ter complex such that forks are blocked at its 
non-permissive (NP) face, but allowed to pass from the permissive 
(P) end®"'. The two Ter clusters thus form a trap from which the first 
arriving fork can enter but not leave, awaiting arrival of the other’. 

The mechanism determining polarity of Tus-Ter action serves as a 
model for communication between replication forks and double- 
stranded (ds) DNA-binding proteins, but it is also controversial. 
Strand separation by the DnaB helicase at the NP face can engineer 
a new structure, the ‘locked’ complex of the mousetrap model 
(Fig. 1b)°. Cytosine(6) of Ter flips out of the DNA helix to form new 
interactions in a pocket on Tus that markedly prolongs the lifetime 
(>40-fold) of the Tus-Ter complex, protecting the central interactions
from the trailing polymerase. Conversely, strand separation at the 
P face rapidly dissociates Tus. 

Despite the stability of the locked complex in vitro, at any sampling 
time in reporter plasmids in vivo, even when Tus is overproduced, 
~50% of forks moving towards the NP face displace Tus*’. This could 
be due either to the fork block being transient or to its low efficiency of 
formation. The KD of the Tus-TerB locked complex is only threefold
lower than that of Tus-dsTerB while its lifetime is much longer®, so we tested
the hypothesis that the efficiency of lock formation is kinetically con- 
trolled, that is, that NP fork-arrest efficiency is determined by com- 
petition between lock formation and Tus displacement, dependent on 
the rate of fork approach. Inherent inefficiency of fork arrest would 
also explain the presence of backup Ter sites in the chromosome. 
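
The kinetic-control hypothesis can be made concrete with a toy calculation. The sketch below is not the authors' model. It assumes, purely for illustration, that lock formation is a first-order process with a hypothetical rate constant `k_lock`, and that Tus displacement proceeds at a rate proportional to fork speed through a hypothetical factor `displacement_per_bp`. Under those assumptions the arrest probability is the branching ratio of two competing first-order processes, and it falls as forks move faster. The default constants are arbitrary and were chosen only so that the crossover sits near 1,000 bp s⁻¹.

```python
def arrest_probability(fork_rate_bp_s, k_lock=1.0, displacement_per_bp=0.001):
    """Branching ratio for two competing first-order processes.

    k_lock              -- hypothetical rate of lock formation (per second)
    displacement_per_bp -- hypothetical factor converting fork speed (bp/s)
                           into a Tus displacement rate (per second)
    """
    if not fork_rate_bp_s >= 0:
        raise ValueError("fork rate must be non-negative")
    k_displace = displacement_per_bp * fork_rate_bp_s
    return k_lock / (k_lock + k_displace)


# Illustrative fork speeds only; these are not measured values.
for rate in (400, 800, 1200, 1600):
    print(f"{rate:>5} bp/s -> P(arrest) = {arrest_probability(rate):.2f}")
```

Any model of this form predicts the qualitative trend tested below: slower forks are arrested more often than faster ones.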


LETTER 


We used single-molecule imaging to monitor the fate of the E. coli 
leading-strand replisome as it approaches Tus-TerB from either dir- 
ection. Real-time synthesis trajectories were derived from multiplexed 
arrays by monitoring the length of individual DNA molecules’*"*. The 
forked primer-template DNAs, each with a single TerB site 3.6 kilobase 
(kb) from the site of fork assembly, were tethered between the surface of a
coverslip and a magnetic bead (Fig. 1c and Extended Data Fig. 1) and 
extended by a laminar flow exerting a 2.6 pN drag force on the beads. The 
trajectories (Fig. 1d) show DNA shortening through its conversion 
from ds (long) to single-stranded (short) during leading-strand 
synthesis. The position of TerB could be defined to ±0.1 kb under this
force regime (see Methods). Consistent with previous single-molecule 
studies of DNA replication’’"”, rates of DNA synthesis vary among 
replisomes (Extended Data Fig. 2), reflecting the in vivo situation’®. 
The source of this heterogeneity is unknown (Supplementary 
Discussion 1). 
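
As an illustration of how a synthesis rate can be read from a single trajectory, the sketch below fits a straight line to (time, synthesized length) points by ordinary least squares. The data points are invented, the function name `synthesis_rate` is hypothetical, and this passage does not specify the authors' own fitting procedure (see Methods), so this is only one plausible way to obtain a per-molecule rate.

```python
def synthesis_rate(times_s, lengths_bp):
    """Least-squares slope of length versus time, in bp per second."""
    n = len(times_s)
    if n != len(lengths_bp) or n in (0, 1):
        raise ValueError("need two or more paired observations")
    mean_t = sum(times_s) / n
    mean_l = sum(lengths_bp) / n
    sxx = sum((t - mean_t) ** 2 for t in times_s)
    if sxx == 0:
        raise ValueError("time points must not all be identical")
    sxy = sum((t - mean_t) * (l - mean_l) for t, l in zip(times_s, lengths_bp))
    return sxy / sxx


# Hypothetical trajectory: a fork advancing at a constant 900 bp/s.
times = [0.0, 0.5, 1.0, 1.5, 2.0]
lengths = [900.0 * t for t in times]
print(synthesis_rate(times, lengths))
```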

In the absence of Tus, 5 ± 3% of forks that reached the TerB site
stopped there by chance (Table 1, Fig. 1e, Extended Data Fig. 3a, b and
Supplementary Discussion 2). With Tus-TerB oriented with its P face
towards the fork (PTerB), this frequency increased to 11 ± 5% (including
the 5% random stoppage), presumably owing to forks encountering
a strong protein-DNA roadblock (Fig. 1e and Supplementary
Discussion 3). Transient stoppage followed by resumption of synthesis 
occurred in 5% of trajectories, and in the remaining 84%, replication 
forks displaced Tus and continued synthesis without stopping, even 
transiently (Fig. 1d, e). The average rate of DNA synthesis was other- 
wise unaffected by Tus (Extended Data Fig. 3c). 
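
The uncertainties on these percentages are described as standard deviations of binomial distributions. The sketch below assumes this means the standard deviation of a binomial proportion, sqrt(p(1 − p)/N), and applies it to the stop percentages quoted in the text and Table 1 with the N values given in the Fig. 1e legend. The function name is hypothetical.

```python
import math


def binomial_sd_percent(percent, n):
    """Standard deviation of a binomial proportion, expressed in percent."""
    p = percent / 100.0
    return 100.0 * math.sqrt(p * (1.0 - p) / n)


# (stop percentage, N) for each condition, taken from the text and Fig. 1e legend.
conditions = {
    "NPTerB (-Tus)": (5, 60),
    "NPTerB (+Tus)": (45, 64),
    "PTerB (+Tus)": (11, 37),
}
for name, (percent, n) in conditions.items():
    print(f"{name}: {percent} ± {binomial_sd_percent(percent, n):.0f}%")
```

Rounded to whole percentages, this formula gives 3, 6 and 5, matching the quoted 5 ± 3%, 45 ± 6% and 11 ± 5%.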


Table 1 | Fate of replisomes and fork rate dependencies of events at Tus-bound Ter sites

TerB sequence:
5’-AATAAGTATGTTGTAACTAAAGT-3’
3’-TTATTCATACAACATTGATTTCA-5’

Ter site                     Tus      Stop (%)  Bypass (%)  Restart (%)  Pause time (s)       Stop/restart rate (bp s⁻¹)   Bypass rate (bp s⁻¹)
PTerB                        WT       11 ± 5    84 ± 6      5 ± 4        31 ± 24              330 ± 30                     1,300 ± 110 (1,250 ± 120)
NPTerB [GC(6)-NPTerB]        No Tus   5 ± 3     95 ± 3      0            –                    800 ± 100                    1,160 ± 70 (930 ± 70)
NPTerB [GC(6)-NPTerB]        WT       45 ± 6    52 ± 6      3 ± 2        146 ± 31             890 ± 70 (840 ± 40)          1,690 ± 100 (1,700 ± 140)
NPTerB [GC(6)-NPTerB]        H144A    27 ± 8    55 ± 9      18 ± 7       33 ± 5               840 ± 120 (740 ± 10)         1,920 ± 130 (1,520 ± 120)
NPTerB [GC(6)-NPTerB]        R198A*   5 ± 5     77 ± 9      18 ± 8       14 ± 4               400 ± 40                     1,300 ± 140 (1,310 ± 70)
CG(6)-NPTerB                 WT       8 ± 4     47 ± 8      45 ± 8       29 ± 6 (37 ± 6)      820 ± 70 (720 ± 80)          1,810 ± 110 (1,740 ± 60)
CG(6)-NPTerB                 WT†      18 ± 9    41 ± 11     41 ± 11      56 ± 13              800 ± 80 (790 ± 70)          820 ± 180
TA(6)-NPTerB                 WT       7 ± 4     73 ± 7      20 ± 6       22 ± 5 (24 ± 2)      380 ± 60 (360 ± 40)          1,350 ± 110 (1,310 ± 50)
5-mismatch C6-NPTerB         WT       89 ± 6    11 ± 6      0            –                    1,160 ± 130 (1,030 ± 130)    1,230 ± 310
5-mismatch C6-NPTerB         R198A*   86 ± 7    14 ± 7      0            –                    1,130 ± 120 (950 ± 170)      1,200 ± 270
5-mismatch G6-NPTerB         WT       8 ± 5     15 ± 7      77 ± 8       111 ± 18 (177 ± 20)  1,130 ± 140 (960 ± 170)      230 ± 170
TA(5)-NPTerB                 WT       32 ± 9    68 ± 9      0            –                    390 ± 70 (360 ± 40)          1,480 ± 140 (1,320 ± 60)
Swapped F4n GC(6)-NPTerB     WT       41 ± 9    59 ± 9      0            –                    880 ± 110 (860 ± 90)         1,480 ± 120 (1,550 ± 110)
Swapped F5n GC(6)-NPTerB     WT       22 ± 7    75 ± 7      3 ± 3        186                  400 ± 60 (310 ± 70)          1,260 ± 100 (1,260 ± 60)
NPTerH                       WT       19 ± 6    70 ± 8      11 ± 5       180 ± 26             400 ± 60 (350 ± 20)          1,320 ± 100 (1,300 ± 60)

The TerB site and its alterations are depicted in cartoon with the native nucleotides in black except that C6 is in yellow and substituted nucleotides are in magenta. The directionality of the replication fork is shown
with the strands on which Pol III holoenzyme and DnaB translocate. The nucleotides in native TerH that differ from TerB are also shown in magenta. The sequences of oligonucleotides used to assemble the variants
of TerB substrates are given in Extended Data Fig. 1b.

Stop, bypass and restart events were quantified as a percentage of all events that reached or bypassed TerB (see Methods). Uncertainties correspond to standard deviations of binomial distributions. The means of
the rates and pause durations are shown either as their arithmetic averages or, in parentheses, by fitting their histograms with Gaussian or exponential decay distributions, respectively (see Fig. 2a, c and Methods).
The uncertainties correspond to standard errors. Each experimental condition represents results from three or four technical replicates and the number of derived molecules (N) is specified in the corresponding
figures.

*Concentration of Tus(R198A) was 250 nM versus the standard 80 nM.

†Reaction from which the restart proteins PriA, PriB and DnaT were omitted.




In contrast, when the fork approached the NP face (NPTerB), permanent
stoppage (9.7 ± 1 min) of DNA synthesis occurred in 45% of
trajectories, and restart in only 3% (Fig. 1d, e and Extended Data Fig. 
2a, b). The remaining 52% showed no sign even of transient stoppage. 
All TerB sites were Tus-bound under our experimental conditions 
(Fig. 1f), indicating they have an inherently low efficiency of fork 
arrest. We are thus able for the first time to distinguish between the 
two different mechanisms that could explain the in vivo data®”. 

Fork arrest is attenuated in vivo by DNA supercoiling, suggesting 
that it is affected by the rate of strand separation®. To test this pro- 
position, we separated the trajectories that showed full or transient 
stoppage from those that did not and found the rate of DNA synthesis 
and fork bypass were correlated (r = 0.62; Extended Data Fig. 4a); fast 
forks were arrested less often than slower ones. In fact, there was a 
twofold difference in average rates of synthesis at forks that stopped 
and those that bypassed TerB (Fig. 1g). DNA synthesis at individual 
forks before stoppage at TerB, or in full trajectories where they 
bypassed it, progressed at nearly constant rates under our spatial 
and temporal resolution (Extended Data Fig. 4b-e). As we showed 
previously’, the overall average rate reproduces the average in vivo 
rate (~950 bp s⁻¹)¹⁸. This underscores the significance of our ability to
achieve the in vivo rate of DNA synthesis to reproduce the ~50% 
efficiency of fork arrest in vivo. 
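
The twofold difference can be checked from the Gaussian-fit mean rates for NPTerB reported in Table 1 and Fig. 1g (1,700 ± 140 bp s⁻¹ for bypass, 840 ± 40 bp s⁻¹ for stop/restart). The sketch below propagates the two standard errors into the ratio, assuming they are independent. That independence is an assumption made here, not a statement from the paper, and the function name is hypothetical.

```python
import math


def ratio_with_error(a, sigma_a, b, sigma_b):
    """Return a/b and its uncertainty, assuming independent errors."""
    ratio = a / b
    relative = math.sqrt((sigma_a / a) ** 2 + (sigma_b / b) ** 2)
    return ratio, ratio * relative


# Gaussian-fit mean rates (bp/s) for NPTerB with wild-type Tus.
bypass_rate, bypass_se = 1700.0, 140.0
stop_rate, stop_se = 840.0, 40.0
ratio, err = ratio_with_error(bypass_rate, bypass_se, stop_rate, stop_se)
print(f"bypass/stop rate ratio = {ratio:.1f} ± {err:.1f}")
```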

The rate dependence of stoppage supports the hypothesis that 
strand separation competes with inefficient C6 flipping. To dem- 
onstrate this, we pre-formed the locked complex before replisome 
assembly using TerB with a mismatched bubble in place of base pairs 
3—7 while keeping an unpaired C6 (Fig. 1h and Table 1)°. The yield of 
fork arrest increased to 89%; thus, once the lock is established, it is a 
very effective fork block. 

We next interrogated the role of lock formation with C6-defective 
TerB mutants’. Surprisingly, the GC(6) to CG substitution did not lead 
to ~95% bypass. Instead, it resulted in transient (for 37 ± 6 s) rather
than permanent blockage, again in ~50% of trajectories (Fig. 2a, b and 
Extended Data Fig. 2c). Moreover, the fork rate dependence of pausing 
was similar to the normal lock (Fig. 2c). DnaB remained at the fork 
during transient arrest since DNA synthesis could restart in the 
absence of helicase reloading proteins (Table 1). The crystal structure 
of a Tus complex with a forked Ter containing an unpaired G replacing
C6 showed that the substituted G6 base neither bound in the cytosine 
pocket nor formed any new specific interaction with Tus (Fig. 2d and 
Extended Data Table 1). This remained the case even when the fork 
was extended to also disrupt the TA(7) base pair (Extended Data 
Fig. 5a-c). Thus the fork-rate-dependent step producing transient 
stoppage must precede engagement of C6 in its binding pocket. 

In the Tus crystal structures, the α6/L3/α7 region has extensive
interactions with the lagging strand (Fig. 1b) before and after C6 lock 
formation®"', providing a paradox about how the lagging-strand- 
translocating DnaB in fast-approaching replisomes disrupts these 
interactions without even pausing. The main sequence-specific contacts
this region makes with the first 6 bp of dsTer are via Arg198 in L3 with
the A5 and G6 bases on the lagging strand and T5 on the leading strand 
(Extended Data Fig. 5e)'’, but these interactions are not present in the 
locked complex, where Arg198 makes a new salt bridge to the phos- 
phate between lagging strand nucleotides 6 and 7 (Extended Data Fig. 
5a)°. We suggest that the Arg198 side chain forms transient interac- 
tions with G6, the TA(5) base pair and the lagging-strand phosphate, 
holding the two DNA strands together before strand separation 
reaches GC(6) (Extended Data Fig. 5e). Moreover, comparison of 
the structures of Tus with the wild type’? and CG(6) mutant dsTer 
sites (Extended Data Table 1) suggests rearrangement of lagging strand 
interactions with Arg198 (Extended Data Fig. 5e, f). We propose that 
Arg198-DNA contacts rearrange substantially during strand separa- 
tion. This provides a window of opportunity for the fast-moving DnaB 
to break into the Tus-Ter central interactions before Arg198
rearrangement or C6 base flipping occurs. 




Figure 2 | Characterization of transient stoppage of the replication fork at 
the non-permissive face of Tus-TerB before C6 base flipping. 

a, Representative trajectories of restart of DNA synthesis after transient 
stoppage at a Tus-bound TerB site where the GC(6) base pair was swapped to 
CG(6) (CG(6)-NPTerB). The distribution of pause durations was fit with a 
single exponential decay; the fit line is shown and uncertainty corresponds to 
the standard error (N = 16). b, Percentages of the populations of replication 
forks that bypassed, transiently stopped or fully stopped at CG(6)-NPTerB. 
Error bars correspond to standard deviations of binomial distributions
(N = 38). c, Rate dependence of replication restart at CG(6)-NPTerB. The rate
distributions of leading-strand synthesis for events that bypassed (grey; N = 18)
or stopped/restarted (blue bars; N = 20) at CG(6)-NPTerB are presented as in
Fig. 1g. d, Crystal structure of Tus with a forked Ter sequence that has a 
substituted G base in the C6 position in the locked complex (see also Extended 
Data Fig. 5b). The G base, highlighted in blue, was neither docked into the 
cytosine-binding pocket nor forming any new interactions with Tus. 
Highlighted nucleotides (at bottom) were not visible in the structure. e, Fates of 
replication forks at Tus bound to the NP face of a TerB site containing a 
bubbled-DNA structure in place of base pairs 3–7 in TerB and with G replacing
C6 (5-mismatch G6-NPTerB). Error bars correspond to standard deviations of 
binomial distributions (N = 26). 
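
The legend states that pause durations were fit with a single exponential decay. As a hedged illustration of what such a lifetime represents, the sketch below uses a different and simpler estimator than a histogram fit: for exponentially distributed durations, the maximum-likelihood lifetime is the sample mean, with a standard error of roughly tau/sqrt(N). The pause values are invented for illustration and are not data from the paper.

```python
import math


def exponential_lifetime(durations_s):
    """Maximum-likelihood lifetime of an exponential distribution.

    Returns (tau, standard_error). For an exponential distribution the MLE of
    tau is the sample mean, and its standard error is about tau / sqrt(N).
    """
    n = len(durations_s)
    if n == 0:
        raise ValueError("need at least one pause duration")
    tau = sum(durations_s) / n
    return tau, tau / math.sqrt(n)


# Hypothetical pause durations in seconds (not data from the paper).
pauses = [12.0, 25.0, 31.0, 40.0, 58.0, 74.0]
tau, se = exponential_lifetime(pauses)
print(f"tau = {tau:.0f} s, standard error = {se:.0f} s")
```

A histogram fit, as used in the paper, can give a different uncertainty from this estimate, so the two should not be compared directly.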




To test this proposition, we used a bubble substrate with altered C6 
(5-mismatch G6-NPTerB; see Table 1) to eliminate lock formation® 
but allow rearrangement of interactions on the separated strands 
before arrival of DnaB. We observed efficient transient stoppage that 
reached 77% with a long duration of 177 ± 20 s (Fig. 2e; Extended Data
Fig. 6). The fivefold increase in pause duration compared to CG(6)- 
NPTerB (Table 1) is probably owing to the interactions of the unpaired 
seventh nucleotide as in the locked complex structure’. Thus, strand 
separation beyond GC(6) in the absence of the C6 lock would impose 
only transient fork stoppage. 

We next altered A5 and T5 alone and in the context of the first 5 bp
of TerB (Table 1); this resulted in the largest decrease in yield of
stoppage and shift in the rate-dependence of arrest to lower values,
underscoring that A5 and/or T5 are the primary contributors in this
region to the rate dependence (Extended Data Fig. 7a-c and
Supplementary Discussion 4). AT(5) is conserved in strong Ter sites
but not at weaker ones like TerH’. We also found that the NP Tus-TerH
complex stops the forks with a similarly low rate dependence to
TA(5)-NPTerB (Table 1). However, one-third of the stopped forks
restarted synthesis after a pause of 180 ± 26 s (Extended Data Fig.
7g, h), which we attribute to other alterations in TerH that weaken its
binding in its locked form”.

Figure 3 | Model of Tus-Ter polar arrest activity at the non-permissive face.
Prior to strand separation, Arg198 makes base-specific contacts with A5 and G6
on the lagging strand to protect Tus-Ter central interactions from the DnaB
helicase. After separation of the first six base pairs, Arg198 maintains contacts
with the lagging strand by rearranging its interactions to make a new salt bridge
to the phosphate between A5 and G6 and a new unidentified base-specific
interaction is induced with T5. Competition between rates of strand separation
and rearrangement of Arg198 interactions determines Tus-Ter efficiency.

Substitution of wild-type GC(6) with CG has a modest effect on Tus 
binding to dsTerB in comparison to an AT or TA? We observed that, 
relative to CG(6)-NPTerB, the TA substitution resulted in transient 
fork stoppage with decreased yield (Table 1 and see Fig. 2a—c; Extended 
Data Fig. 7d-f), demonstrating the importance of the specific inter- 
action of Tus with the native G6 for transient stoppage. 

We then altered Arg198 itself. The R198A mutant interacts with 
dsTerB with a 140-fold increased KD, but only a twofold shorter lifetime”.
We showed by surface plasmon resonance (SPR) that
Tus(R198A) can form a lock (Extended Data Fig. 8a—-d), but it was 
very defective in fork arrest (Table 1); stoppage was inefficient (18%) 
and transient (pauses of 14 ± 4 s; N = 4). Nevertheless, preforming the
locked complex with R198A on the 5-mismatch C6-NPTerB substrate 
restored efficient stoppage (Table 1), consistent with lock formation. 
These results suggest that C6 flipping cannot occur unless Arg198 
interactions slow down or transiently stop the fork beforehand. 
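
The R198A binding numbers quoted above imply a large change in association rather than dissociation, if one assumes simple 1:1 binding in which KD = kd/ka and the complex lifetime is 1/kd. The sketch below works through that arithmetic. The 1:1 model and the function name are assumptions made here for illustration.

```python
def association_fold_decrease(kd_equilibrium_fold, lifetime_fold_shorter):
    """Fold decrease in the association rate constant ka.

    Assumes simple 1:1 binding with KD = kd / ka and lifetime = 1 / kd.
    kd_equilibrium_fold   -- fold increase in the equilibrium constant KD
    lifetime_fold_shorter -- fold decrease in complex lifetime, which equals
                             the fold increase in the dissociation rate kd
    """
    return kd_equilibrium_fold / lifetime_fold_shorter


# R198A versus wild-type Tus on dsTerB: 140-fold higher KD, twofold shorter lifetime.
print(association_fold_decrease(140, 2))
```

Under these assumptions the mutation would slow association about 70-fold while barely changing how long a formed complex persists.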




Faster moving forks have higher probability to separate GC(6) before 
rearrangement of Arg198 interactions, leading to effective displacement of Tus 
(probability A). The slower forks are either stopped permanently before or 
during GC(6) melting (probability B) or transiently if GC(6) is melted and 
Arg198 succeeds in rearranging its interactions (probability C, step 1). The C(6) 
mousetrap acts as a terminal step that is enabled by the transient stoppage to 
impose permanent fork arrest (probability C, step 2). 


We have thus revealed two separate processes, one leading to a
transient stoppage preceding but probably on the pathway to C6 lock 
formation, and one that leads directly to bypass and Tus dissociation. 
Previous results have suggested the operation of an uncharacterized 
C6-lock-independent arrest mechanism’. Our study shows that this 
mechanism must be invoked before or as GC(6) is melted, because 
permanent stoppage was not achieved when the GC(6) was melted in 
the absence of the C6 lock (Fig. 2e). To explore whether the interac- 
tions of Arg198 with AT(5) and G6 contribute directly to this alternate 
mechanism, we maintained these interactions using the TerB sequence 
and deactivated the C6 lock using Tus(H144A), the key residue in the 
binding pocket. This mutation completely eliminated lock formation 
(SPR in Extended Data Fig. 8e, f; X-ray structure in Extended Data 
Fig. 5d). However, we still observed a high level (27%) of permanent 
fork arrest, confirming existence of a lock-independent process leading 
to permanent stoppage. There were also significant restarts (18%; 
Table 1) after short pauses (33 ± 5 s; N = 6). Pausing must result from
a mechanism additional to permanent arrest, since restarts would 
otherwise be randomly distributed over the full 10 min period of obser- 
vation. The rate-dependence of arrest was similar to wild-type Tus 
(Table 1 and Extended Data Fig. 9). 
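
The argument about randomly distributed restarts can be put in numbers. The sketch below assumes that "randomly distributed" means restart times drawn uniformly from the 10 min (600 s) observation window, and computes the mean pause and the standard error of the mean expected for N = 6 events. The uniform model and the function name are simplifying assumptions made here.

```python
import math


def uniform_pause_stats(window_s, n_events):
    """Mean and standard error of the mean for pauses uniform on [0, window_s]."""
    mean = window_s / 2.0
    sd = window_s / math.sqrt(12.0)
    return mean, sd / math.sqrt(n_events)


# 10 min observation window and N = 6 restarts, as reported for Tus(H144A).
expected_mean, sem = uniform_pause_stats(600.0, 6)
observed_mean = 33.0
print(f"expected {expected_mean:.0f} ± {sem:.0f} s, observed {observed_mean:.0f} s")
```

The observed mean pause of 33 s lies several standard errors below the value expected under the uniform assumption, which is consistent with the conclusion that pausing reflects a distinct mechanism.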

Collectively, our results show that interactions of Arg198 of Tus with
G6, A5 and/or T5 act to protect Tus-Ter central interactions from the
first arriving DnaB. Nevertheless, these gatekeeping interactions are
dynamic during separation of the first 6 bp and their rearrangement
occurs in competition with strand separation. We suggest that faster
forks have higher probability to separate GC(6) before rearrangement
of Arg198 interactions, displacing Tus without pausing (Fig. 3, probability
A). Slower forks are either stopped permanently before GC(6) melting
(probability B) or transiently if GC(6) is melted and Arg198 succeeds
in rearranging (probability C). The inefficient C6 mousetrap is a terminal
step, enabled by transient stoppage to impose permanent fork
arrest (probability C). These results provide an explanation of why, in
helicase assays, the slowly moving DnaB (35–390 bp s⁻¹)??? is efficiently
stopped at the NP face without requiring C6 flipping”’.

Thus, we refine the mousetrap model and redefine the efficiency of 
Tus-Ter polar arrest to depend on collective contributions of intrinsic 
affinity of Tus for Ter, stability of the flipped C6 in its binding pocket, 
and rate-dependent induction of fork stoppage that fully or temporarily 
protects Tus-Ter central interactions from DnaB. Our observations 
also raise a question about how weaker Ter sites evolved to block slower 
forks (Supplementary Discussion 5). The encounter of dsDNA-binding 
proteins with motor proteins like helicases and polymerases is a com- 
mon feature in replication, repair, recombination and transcription 
and where conflict among these processes arises (reviewed in ref. 24). 
We show for the first time that intrinsic heterogeneity in rates of 
individual molecular motors can have different biological outcomes 
as they communicate with dsDNA-binding proteins and other barriers 
(Supplementary Discussion 6). 


Online Content Methods, along with any additional Extended Data display items 
and Source Data, are available in the online version of the paper; references unique 
to these sections appear only in the online paper. 


Received 21 January; accepted 26 June 2015. 
Published online 31 August 2015. 


1. Dalgaard, J. Z. et al. Random and site-specific replication termination. Methods Mol.
Biol. 521, 35-53 (2009).

2. Hill, T. M., Henson, J. M. & Kuempel, P. L. The terminus region of the Escherichia coli
chromosome contains two separate loci that exhibit polar inhibition of replication.
Proc. Natl Acad. Sci. USA 84, 1754-1758 (1987).

3. Coskun-Ari, F. F. & Hill, T. M. Sequence-specific interactions in the Tus-Ter complex
and the effect of base pair substitutions on arrest of DNA replication in Escherichia
coli. J. Biol. Chem. 272, 26448-26456 (1997).

4. Neylon, C., Kralicek, A. V., Hill, T. M. & Dixon, N. E. Replication termination in
Escherichia coli: structure and antihelicase activity of the Tus-Ter complex.
Microbiol. Mol. Biol. Rev. 69, 501-526 (2005).

5. Duggin, I. G., Wake, R. G., Bell, S. D. & Hill, T. M. The replication fork trap and
termination of chromosome replication. Mol. Microbiol. 70, 1323-1333 (2008).

6. Mulcair, M. D. et al. A molecular mousetrap determines polarity of termination of
DNA replication in E. coli. Cell 125, 1309-1319 (2006).

7. Kaplan, D. L. & Bastia, D. Mechanisms of polar arrest of a replication fork. Mol.
Microbiol. 72, 279-285 (2009).

8. Valjavec-Gratian, M., Henderson, T. A. & Hill, T. M. Tus-mediated arrest of DNA
replication in Escherichia coli is modulated by DNA supercoiling. Mol. Microbiol. 58,
758-773 (2005).

9. Duggin, I. G. & Bell, S. D. Termination structures in the Escherichia coli chromosome
replication fork trap. J. Mol. Biol. 387, 532-539 (2009).

10. Coskun-Ari, F. F., Skokotas, A., Moe, G. R. & Hill, T. M. Biophysical characteristics of
Tus, the replication arrest protein of Escherichia coli. J. Biol. Chem. 269,
4027-4034 (1994).

11. Kamada, K., Horiuchi, T., Ohsumi, K., Shimamoto, N. & Morikawa, K. Structure of a
replication-terminator protein complexed with DNA. Nature 383, 598-603
(1996).

12. Lee, J. B. et al. DNA primase acts as a molecular brake in DNA replication. Nature
439, 621-624 (2006).

13. Tanner, N. A. et al. Single-molecule studies of fork dynamics in Escherichia coli DNA
replication. Nat. Struct. Mol. Biol. 15, 170-176 (2008).

14. Jergic, S. et al. A direct proofreader-clamp interaction stabilizes the Pol III replicase
in the polymerization mode. EMBO J. 32, 1322-1333 (2013).

15. Tanner, N. A. et al. Real-time single-molecule observation of rolling-circle DNA
replication. Nucleic Acids Res. 37, e27 (2009).

16. Hamdan, S. M., Loparo, J. J., Takahashi, M., Richardson, C. C. & van Oijen, A. M.
Dynamics of DNA replication loops reveal temporal control of lagging-strand
synthesis. Nature 457, 336-339 (2009).

17. Yao, N. Y., Georgescu, R. E., Finkelstein, J. & O’Donnell, M. E. Single-molecule
analysis reveals that the lagging strand increases replisome processivity but
slows replication fork progression. Proc. Natl Acad. Sci. USA 106, 13236-13241
(2009).

18. Pham, T. M. et al. A single-molecule approach to DNA replication in Escherichia coli
cells demonstrated that DNA polymerase III is a major determinant of fork speed.
Mol. Microbiol. 90, 584-596 (2013).

19. Moreau, M. J. & Schaeffer, P. M. Differential Tus-Ter binding and lock formation:
implications for DNA replication termination in Escherichia coli. Mol. Biosyst. 8,
2783-2791 (2012).

20. Neylon, C. et al. Interaction of the Escherichia coli replication terminator protein
(Tus) with DNA: a model derived from DNA-binding studies of mutant proteins by
surface plasmon resonance. Biochemistry 39, 11989-11999 (2000).

21. Bastia, D. et al. Replication termination mechanism as revealed by Tus-mediated
polar arrest of a sliding helicase. Proc. Natl Acad. Sci. USA 105, 12831-12836
(2008).

22. Ribeck, N., Kaplan, D. L., Bruck, I. & Saleh, O. A. DnaB helicase activity is modulated
by DNA geometry and force. Biophys. J. 99, 2170-2179 (2010).

23. Kim, S., Dallmann, H. G., McHenry, C. S. & Marians, K. J. Coupling of a replicative
polymerase and helicase: a τ-DnaB interaction mediates rapid replication fork
movement. Cell 84, 643-650 (1996).

24. Finkelstein, I. J. & Greene, E. C. Molecular traffic jams on DNA. Annu. Rev. Biophys.
42, 241-263 (2013).


Supplementary Information is available in the online version of the paper. 


Acknowledgements We thank A. van Oijen for critical comments and the groups of 
N. Dekker and S. Patel for helpful discussions. This research was supported by the King 
Abdullah University of Science and Technology through core funding (to S.M.H.) and a 
Faculty Initiated Collaborative Award (to S.M.H. and N.E.D.), and by the Australian 
Research Council (DP0877658 to N.E.D. and A.J.O.; DP0984797 to N.E.D.), including
an Australian Professorial Fellowship to N.E.D. and a Future Fellowship (FT0990287) to
A.J.O. X-ray crystallographic data were collected at the Australian Synchrotron, Victoria,
Australia. 


Author Contributions M.M.E. designed and carried out the single-molecule replication 
assays; M.M.E., M.A.S. and M.T. established the single-molecule replication assays; S.J.
designed and carried out SPR measurements; S.J. and Z.-Q.X. isolated proteins; Z.-Q.X. 
and A.J.O. crystallized complexes, collected X-ray data and refined crystal structures. 
M.M.E., S.J., N.E.D. and S.M.H. designed the research and wrote the article. All authors 
analysed the data, discussed the results and commented on the manuscript. 


Author Information Atomic coordinates and structure factors for the reported crystal 
structures have been deposited at the Protein Data Bank under accession codes 4XR0
(Tus-UGLT fork), 4XR1 (Tus-TGTA fork), 4XR2 (H144A-WT fork) and 4XR3 
(Tus-UGLC). Reprints and permissions information is available at www.nature.com/ 
reprints. The authors declare no competing financial interests. Readers are welcome to 
comment on the online version of the paper. Correspondence and requests for 
materials should be addressed to N.E.D. (nickd@uow.edu.au) or S.M.H. 
(samir.hamdan@kaust.edu.sa). 




METHODS 


No statistical methods were used to predetermine sample size, the experiments 
were not randomized and the investigators were not blinded to allocation during 
experiments and outcome assessment. 

Protein expression and purification. Described methods were used to prepare
N-terminally His₆-tagged Tus”? and its mutant derivatives Tus(R198A)*° and
Tus(H144A)°, as well as the following E. coli DNA replication proteins: the β₂ sliding
clamp”, the Pol III τ₃δδ′χψ clamp loader and αεθ core’’, the DnaB₆(DnaC)₆
helicase/loader complex"* and the fork restart proteins PriA, PriB and DnaT”.
Crystallization of Tus-Ter complexes and data collection. Four crystal structures
of Tus-Ter complexes are reported (Extended Data Table 1; the oligonucleotide
sequences and proteins used are given in Extended Data Fig. 5). All complexes
(finally at 4-5 mg ml⁻¹ protein) were prepared with a slight excess of DNA in
10 mM Bis-Tris, pH 6.5, 1 mM EDTA, 2 mM dithiothreitol, and excess DNA was
removed by using a centrifugal ultrafiltration device, as described previously*.
Crystals were grown using the vapour diffusion (hanging drop) method at 23 °C.
The protein-DNA complex (3 µl) was mixed with an optimized reservoir solution
(3 µl) consisting of 8-12% PEG 3350, 0.1-0.2 M NaI, 50 mM Bis-Tris, pH 6.2-6.8.
Crystals appeared after 2 days and reached maximum size after 10 days. The pH
was measured using 1 M stock solutions of buffer before addition to the reservoir.
Optimized reservoir solutions for the four crystals contained: for Tus-UGLT fork
(forked Ter with C6 to G change), 12% PEG 3350, 0.2 M NaI, 50 mM Bis-Tris pH
6.2; for Tus-TGTA fork (forked Ter with C6 to G change and the fork extended to
position 7), 9% PEG 3350, 0.2 M NaI, 50 mM Bis-Tris pH 6.8; for Tus(H144A)-WT
fork (‘wild-type’ forked Ter with Tus(H144A)), 8% PEG 3350, 0.1 M NaI,
50 mM Bis-Tris pH 6.2; for Tus-UGLC (dsTerA with GC(6) to CG flip), 12%
PEG 3350, 0.2 M NaI, 50 mM Bis-Tris pH 6.5.

All X-ray data were collected at the Australian Synchrotron beamline MX-1 (X-ray
wavelength, 0.95370 Å) using an Oxford cryostream to maintain the crystal
temperature at 100 K. Prior to cooling, crystals were transferred stepwise into
artificial mother liquors finally containing 15% (v/v) MPD
(2-methyl-2,4-pentanediol) in 3% increments of MPD (3 min per step). Data were collected using an
ADSC Quantum 210r area detector, using BLU-ICE for remote data acquisition
and processing”. Data reduction and scaling were achieved with the HKL2000
package**.

Structure determination and refinement. All structures were solved by molecular
replacement in MOLREP” using a previously solved Tus–Ter lock (PDB code:
2I06) or Tus–TerA structure (2I05) as starting model. REFMAC***! was used for
structure refinement and calculation of map weighting factors. COOT” was used
to interpret electron density maps and for model building. Figures were prepared
using PyMOL*.

Assessment of Tus–TerB interactions by surface plasmon resonance (SPR).
Methods were essentially as used previously*”®, except that all experiments were
carried out at 20 °C (instead of 25 °C) and a 6 × 6 multiplex BioRad ProteOn XPR-36
system was used instead of a Biacore 2000 instrument; dissociation rate constants
(kd) of Tus proteins from immobilized biotinylated TerB showed an unusual
temperature dependence (high activation energy), which accounts for lower values
of kd and the dissociation constant (KD) compared to previously reported values
(where they are available*”®).

All measurements used SPR buffer (50 mM Tris pH 7.6, 250 mM KCl, 0.25 mM
EDTA, 0.5 mM dithiothreitol, 0.005% surfactant P20), with a ProteOn NLC
(neutravidin-coated) sensor chip for immobilization of 5′-biotinylated TerB
oligodeoxyribonucleotides (oligos). These were either
(1) 5′-bio-(pD)₁₀-ATAAGTATGTTGTAACTAAAG, oligo-1, or
(2) 5′-bio-(pD)₁₀-GGGGCTATGTTGTAACTAAAG, oligo-2,
each containing a 10-unit abasic deoxyribosephosphate spacer
(pD)₁₀ to move the TerB molecule away from the chip surface”, as well as a
common lagging strand TerB sequence (underlined)”. Hybridization of oligo-3:
5′-CTTTAGTTACAACATACTTAT (C6 of Ter in bold) to oligo-1 produces a full
dsTerB site, while its hybridization to oligo-2 produces a forked Ter where C6 is
unpaired and exposed (mismatched sequences in oligos-2 and -3 are in italics)®.
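
The pairing described above can be checked directly from the three sequences. The following is a minimal sketch, not part of the original protocol: the helper names `reverse_complement` and `unpaired_positions` are illustrative, and the biotin and (pD)10 spacer are omitted because they do not take part in base pairing.

```python
# Oligonucleotide sequences as given in the text (5'->3', spacer and biotin omitted).
OLIGO_1 = "ATAAGTATGTTGTAACTAAAG"   # lagging strand, fully complementary to oligo-3
OLIGO_2 = "GGGGCTATGTTGTAACTAAAG"   # lagging strand with a mismatched 5' end
OLIGO_3 = "CTTTAGTTACAACATACTTAT"   # leading strand carrying C6

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def unpaired_positions(strand: str, partner: str) -> list:
    """1-based positions in `strand` (5'->3') that cannot pair with `partner`
    when the two equal-length strands are aligned antiparallel."""
    target = reverse_complement(partner)
    return [i + 1 for i, (a, b) in enumerate(zip(strand, target)) if a != b]


if __name__ == "__main__":
    print(unpaired_positions(OLIGO_3, OLIGO_1))  # duplex (dsTerB) template
    print(unpaired_positions(OLIGO_3, OLIGO_2))  # forked template
```

With oligo-1 the list is empty (a full duplex), whereas with oligo-2 the last five nucleotides of oligo-3 are left unpaired, the first of which is a cytosine.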

All 36 interaction spots of the sensor chip were activated with three sequential
injections of 1 M NaCl, 50 mM NaOH across six vertical (ligand) flow paths
(40 s each at 40 µl min⁻¹) and six horizontal (analyte) flow paths (40 s each at
100 µl min⁻¹). The surface was further stabilized by two injections of 1 M MgCl₂ in
each direction, with the same contact times and flow rates. Oligos-1 and -2 were
diluted to 200 nM in SPR buffer and immobilized separately onto the six
interaction spots of the vertical flow path (100 µl min⁻¹ for 15 s). The chip was then
rotated 90° and simultaneous assembly of dsTerB and forked Ter templates® on the
chip surface was achieved by hybridization of oligo-3 (300 nM), made to flow
across all six horizontal (analyte) channels at 25 µl min⁻¹ for 400 s. The
sensorgram verified that hybridization went to completion. After subsequent injection of
a concentration series of Tus, the surface was regenerated; remaining proteins and
hybridized DNAs were removed by two injections of 1 M NaCl, 50 mM NaOH
over the six analyte channels at 50 µl min⁻¹ for 40 s, followed by re-hybridization
of oligo-3 as above. Measured stoichiometries of Tus binding to both templates
were close to 1:1 at saturation, as reported previously*””.

Interactions of Tus, Tus(R198A) and Tus(H144A) with TerB and forked Ter
templates were measured by sequential injections in the analyte direction of
one or two appropriate concentration series in SPR buffer (zero and five
concentrations of serially diluted samples) at 40 µl min⁻¹ for 300 s, followed by
dissociation in the same buffer over 2,000 s. The final sensorgrams were interspot and
unmodified ligand flow path subtracted using ProteOn Manager Software
(v. 3.1.0.6) and then zero subtracted and normalized based on the highest response
of hybridized oligo within the discrete ligand flow path using BIAevaluation
software (v. 4.0.1; Biacore AB, Sweden). Equilibrium (dissociation constant, KD)
and kinetic (rate constants, ka and kd) parameters for the binding of Tus proteins to
the Ter fragments were determined by global (simultaneous) fitting of at least five
sensorgrams per measured interaction from the optimized concentration range
using BIAevaluation software and the appropriate interaction model(s):
(Langmuir) 1:1 binding with mass transfer model (LMT, for Tus–dsTerB
interaction; as previously done”), (Langmuir) 1:1 binding model (L, for Tus– and
Tus(R198A)–forked Ter), and 1:1 steady state affinity (LSS) and heterogeneous
ligand-parallel reactions (HLPR) binding models for fitting sensorgrams that
reached an equilibrium response (Tus(R198A)–dsTerB, Tus(H144A)–dsTerB
and Tus(H144A)–forked Ter).
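
For orientation, the simple 1:1 (Langmuir) model named above has a closed-form sensorgram. The sketch below is a generic illustration of that model only; it does not reproduce the BIAevaluation fitting routine or the mass-transfer and heterogeneous-ligand variants, and the parameter values in the usage section are arbitrary placeholders, not fitted values from this study.

```python
import math


def association(t, conc, ka, kd, rmax):
    """Response (RU) t seconds into an analyte injection for 1:1 binding.

    conc in M, ka in M^-1 s^-1, kd in s^-1, rmax in RU.
    """
    k_obs = ka * conc + kd                  # observed rate constant
    r_eq = rmax * ka * conc / k_obs         # plateau response at this concentration
    return r_eq * (1.0 - math.exp(-k_obs * t))


def dissociation(t, r0, kd):
    """Response (RU) t seconds after switching to buffer, starting from r0."""
    return r0 * math.exp(-kd * t)


if __name__ == "__main__":
    # Illustrative parameters only (hypothetical, chosen for the example).
    ka, kd, rmax = 1.0e6, 1.0e-3, 700.0
    for conc_nm in (0.5, 1, 2, 4, 8):
        r_end = association(300.0, conc_nm * 1e-9, ka, kd, rmax)   # 300 s injection
        r_late = dissociation(2000.0, r_end, kd)                   # 2,000 s in buffer
        print(f"{conc_nm:>4} nM: {r_end:6.1f} RU at end of injection, {r_late:6.1f} RU after dissociation")
```

Global fitting, as used in the text, means that one set of ka, kd and Rmax is shared across all concentrations in a series rather than fitted curve by curve.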

Global best fits were used when LSS and HLPR models were used. When L and
LMT models were used, the fitting was constrained by setting the Rmax to a global
constant value (response at saturation of ligand binding sites was set to 700
response units (RU) for bindings to dsTerB and 775 RU for forked Ter). These
values, calculated theoretically as a product of the highest measured response of
hybridized oligo-3 (molecular weight 6,354; used as a normalization unit) onto
oligo-1 (120 RU) and oligo-2 (134 RU) and the factor 5.8 (molecular weight of
His₆-Tus/molecular weight of hybridized oligo-3 = 36,737/6,354), were
compared with experimentally determined values obtained by flowing Tus at a
saturating concentration (1.024 µM) over the two DNA templates (not shown). In
addition, due to slow dissociation, experimentally determined kd values for Tus–
and Tus(R198A)–forked Ter interactions using the L model were assessed by
comparison with the kd values determined from the experiment where
dissociation was monitored over 50,000 s (not shown). To generate kinetic parameters
that were as reliable as possible using the HLPR model, ka and kd were estimated in
the first approximation based on the complete association phase and only the initial
phase of dissociation, where the rate of change is greatest. These values
were sometimes used as initial iterative values; otherwise, the
iterations could slip into local minima without reaching a sensible solution. Only
the fitted kinetic parameters of the prevalent (dominant) reaction from the HLPR model
are presented in Extended Data Fig. 8g. For assessment, KD values
calculated from the obtained kinetic parameters (kd/ka) were compared with KD values
directly obtained using the LSS model (Extended Data Fig. 8g).
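
The arithmetic behind the Rmax constraint and the KD cross-check can be written out as follows. This is a small sketch using only the numbers quoted in the text; the function names are illustrative, and the half-life helper assumes simple first-order dissociation.

```python
import math

MW_HIS6_TUS = 36_737   # molecular weight of His6-Tus, from the text
MW_OLIGO_3 = 6_354     # molecular weight of hybridized oligo-3, from the text


def theoretical_rmax(oligo3_response_ru: float) -> float:
    """Expected saturation response (RU) for 1:1 binding of Tus to hybridized DNA,
    scaling the oligo-3 hybridization response by the mass ratio."""
    return oligo3_response_ru * MW_HIS6_TUS / MW_OLIGO_3


def equilibrium_kd(k_d: float, k_a: float) -> float:
    """Equilibrium dissociation constant KD (M) from kd (s^-1) and ka (M^-1 s^-1)."""
    return k_d / k_a


def half_life(k_d: float) -> float:
    """Half-life (s) of a complex dissociating with first-order rate constant kd."""
    return math.log(2) / k_d


if __name__ == "__main__":
    print(round(MW_HIS6_TUS / MW_OLIGO_3, 1))   # mass ratio, quoted as 5.8 in the text
    print(theoretical_rmax(120))                # oligo-1 surface (dsTerB), set to 700 RU
    print(theoretical_rmax(134))                # oligo-2 surface (forked Ter), set to 775 RU
```

The computed values land within about 1% of the 700 and 775 RU constants used in the fits.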

Single molecule flow stretching assays: DNA substrate constructs. Bacteriophage
λ DNA was modified by ligating a biotinylated fork on one end and a digoxigenin
moiety at the other end as described previously'*. This ligated product was
digested with either EcoRI or ApaI to generate 3.6- and 10.1-kb fragments from
the forked and digoxigenin ends, respectively. An oligonucleotide sequence
containing a single copy of wild type or variants of the TerB site was ligated to the
digested ends of the 3.6- and 10.1-kb fragments as described previously** to
generate DNA constructs with variant TerB sites that are listed in Table 1 and
Extended Data Fig. 1b.

Force calibration. A force extension curve was constructed by measuring the
length of individual 13.7 kb DNA molecules and calculating the hydrodynamic
drag force at different flow rates using the equipartition theorem equation as
described previously***°. The force extension curve was fitted using the worm-like
chain model****. Fluctuation in the laminar flow causes an error in estimating the
force and consequently the length of individual DNA molecules, which results in
an error in estimating the location of the TerB site relative to the fork. At the
applied stretching force of 2.6 pN in our experiments, the error in estimating
the force, derived from the standard deviation among seven DNA molecules in
the same field of view, results in an error of ± ~85 bp in estimating the position
of the TerB site at 3.6 kb from the site of fork assembly**. Consequently, we treated
any replication event ending between 3.5 and 3.7 kb as being stopped at the TerB site.
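
As a rough illustration of the worm-like chain relation used in the calibration, the sketch below evaluates the widely used Marko-Siggia interpolation formula, F = (kBT/P)[1/(4(1 − x/L)²) − 1/4 + x/L]. It is not the authors' fitting code. The persistence length (50 nm) and the rise per base pair (0.34 nm) are textbook assumptions, not values from this study; the temperature corresponds to the 32 °C quoted for the replication experiments.

```python
K_B = 1.380649e-23   # Boltzmann constant, J/K


def wlc_force(rel_ext, temperature_k=305.15, persistence_nm=50.0):
    """Force (pN) at relative extension x/L from the Marko-Siggia interpolation."""
    if not 0.0 <= rel_ext < 1.0:
        raise ValueError("relative extension must be in [0, 1)")
    kt_pn_nm = K_B * temperature_k * 1e21          # convert J to pN nm
    return (kt_pn_nm / persistence_nm) * (
        0.25 / (1.0 - rel_ext) ** 2 - 0.25 + rel_ext
    )


def wlc_extension(force_pn, temperature_k=305.15, persistence_nm=50.0):
    """Relative extension x/L at a given force, found by bisection
    (the force is monotonic in x/L on [0, 1))."""
    lo, hi = 0.0, 1.0 - 1e-12
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if wlc_force(mid, temperature_k, persistence_nm) < force_pn:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    rel = wlc_extension(2.6)                       # applied force from the text
    contour_nm = 13_700 * 0.34                     # assumed B-DNA rise per base pair
    print(f"x/L = {rel:.3f}, end-to-end length = {rel * contour_nm:.0f} nm")
```

Because the curve is steep near full extension, a small error in force translates into a small error in length, which is the source of the positional uncertainty discussed above.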
Single-molecule leading-strand synthesis assay. The leading strand DNA
synthesis and data analysis were performed as described previously'*"* with the
variation of adding Tus to the reaction. Briefly, Tus was first introduced under
continuous flow at 80 nM in buffer containing 30 mM Tris-HCl pH 7.6,
50 mM NaCl, 0.5 mM EDTA, 5 mM dithiothreitol and 10 mM MgCl₂ for 30 min
to ensure the binding of Tus to TerB. The excess DNA-unbound Tus was removed
by washing with 15 flow-cell volumes of replication buffer containing
50 mM HEPES-KOH pH 7.9, 80 mM KCl, 12 mM Mg(OAc)₂, 2 mM MgCl₂,
5 mM dithiothreitol and 0.1 mg ml⁻¹ BSA. Tus was then reintroduced with the
replication proteins under continuous flow in the replication buffer supplemented
with 760 µM of each dNTP, 1 mM ATP and proteins as follows: 80 nM Tus, 30 nM
τ₃δδ′χψ, 30 nM DnaB₆(DnaC)₆ helicase-loader complex, 30 nM β₂ clamp, 60 nM
αεθ core Pol III, and fork restart proteins, 20 nM PriA, 40 nM PriB and 480 nM
DnaT. Experiments were carried out at 32 °C.

For data analysis, the picked particles were first corrected for their Brownian
motion using unreplicated tethered DNA molecules. A pause in DNA synthesis
was called when the amplitude fluctuations of a minimum of six data points
(acquisition rate was 2 Hz) were less than three times the standard deviation of the
noise. Bead displacement was converted into numbers of nucleotides synthesized
using the known length difference between ss- and dsDNA in λ DNA® (3.76 bases
per nm at our applied stretching force of 2.6 pN). Total experimental time was
30 min. In the study of the effect of Tus concentration on Tus–TerB polar arrest
activity (Fig. 1f), Tus was first pre-incubated with the DNA at 80 nM and excess
Tus was washed out as described above for our standard experimental condition.
This was followed by the introduction of either 20 or 80 nM of Tus with the
replication proteins. Tus(H144A) was used at a concentration of 80 nM while
Tus(R198A) was used at 250 nM throughout the reaction. Multiplexed
single-molecule experimental results were derived from three or four technical replicates
for each experimental condition.
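
The pause criterion and the displacement conversion can be sketched as below. This is one possible reading of the criterion rather than the authors' analysis code: "amplitude fluctuations" is interpreted here as the peak-to-peak spread of consecutive samples, which is an assumption, and the function names are illustrative.

```python
BASES_PER_NM = 3.76     # conversion at 2.6 pN, from the text
SAMPLE_RATE_HZ = 2.0    # acquisition rate, from the text
MIN_POINTS = 6          # minimum run length for a pause, from the text


def displacement_to_bases(displacement_nm):
    """Convert bead displacement (nm) into nucleotides synthesized."""
    return displacement_nm * BASES_PER_NM


def find_pauses(trace_bases, noise_sd_bases):
    """Return (start, end) index pairs (end exclusive) of runs of at least
    MIN_POINTS samples whose peak-to-peak spread stays below 3 x noise SD."""
    threshold = 3.0 * noise_sd_bases
    pauses = []
    i, n = 0, len(trace_bases)
    while i < n:
        lo = hi = trace_bases[i]
        j = i + 1
        while j < n:
            new_lo = min(lo, trace_bases[j])
            new_hi = max(hi, trace_bases[j])
            if new_hi - new_lo >= threshold:
                break                      # adding this sample would exceed the threshold
            lo, hi = new_lo, new_hi
            j += 1
        if j - i >= MIN_POINTS:
            pauses.append((i, j))
            i = j                          # continue after the pause
        else:
            i += 1
    return pauses


if __name__ == "__main__":
    # Synthetic trace: synthesis, an 8-sample plateau, then synthesis again.
    trace = [0, 400, 800, 1200, 1600,
             2000, 2010, 1990, 2005, 1995, 2000, 2010, 1990,
             2400, 2800, 3200]
    for start, end in find_pauses(trace, noise_sd_bases=50.0):
        print(f"pause from sample {start} lasting {(end - start) / SAMPLE_RATE_HZ} s")
```

Six samples at 2 Hz correspond to the 3 s minimum pause duration used when classifying restart events.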

Leading-strand synthesis trajectories that randomly terminated
before reaching the position of the Ter site at 3.6 ± 0.1 kb (Extended Data Fig. 3a)
were excluded from analysis. Those that reached the Ter site were separated into
three categories: (1) those that continued unimpeded through the Ter site
(‘bypass’), (2) those that were ‘permanently’ arrested for all of the period of
observation (9.7 ± 1.0 min; ‘stop’), and (3) those that paused (for ≥ 3 s, see above)
and then resumed (‘restart’).
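
The sorting rules above can be expressed as a small decision function. This is a simplified sketch under stated assumptions: each trajectory is summarized by just two hypothetical quantities (its final position and the time spent stationary in the Ter window), and the function name and category strings are illustrative.

```python
TER_WINDOW_KB = (3.5, 3.7)   # TerB position 3.6 +/- 0.1 kb, from the text
MIN_PAUSE_S = 3.0            # shortest pause counted as an arrest, from the text


def classify(final_position_kb, pause_at_ter_s):
    """Classify one leading-strand synthesis trajectory.

    final_position_kb: total synthesis when the trajectory ended.
    pause_at_ter_s: time spent stationary inside the Ter window (0 if none).
    """
    lo, hi = TER_WINDOW_KB
    if final_position_kb < lo:
        return "excluded"    # terminated before reaching the Ter site
    if final_position_kb <= hi:
        return "stop"        # ended within the Ter window
    if pause_at_ter_s >= MIN_PAUSE_S:
        return "restart"     # paused at Ter, then resumed
    return "bypass"          # passed Ter without a detectable pause


if __name__ == "__main__":
    for pos, pause in [(2.0, 0.0), (3.6, 500.0), (8.0, 24.0), (8.0, 0.0)]:
        print(pos, pause, classify(pos, pause))
```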


25. Oakley, A. J. et al. Flexibility revealed by the 1.85 Å crystal structure of the β sliding-
clamp subunit of Escherichia coli DNA polymerase III. Acta Crystallogr. D 59,
1192-1199 (2003).

26. Marians, K. J. φX174-type primosomal proteins: purification and assay. Methods
Enzymol. 262, 507-521 (1995).

27. McPhillips, T. M. et al. Blu-Ice and the Distributed Control System: software for data
acquisition and instrument control at macromolecular crystallography beamlines.
J. Synchrotron Radiat. 9, 401-406 (2002).

28. Otwinowski, Z. & Minor, W. Processing of X-ray diffraction data collected in
oscillation mode. Methods Enzymol. 276, 307-326 (1997).

29. Vagin, A. & Teplyakov, A. MOLREP: an automated program for molecular
replacement. J. Appl. Crystallogr. 30, 1022-1025 (1997).

30. Murshudov, G. N., Vagin, A. A. & Dodson, E. J. Refinement of macromolecular
structures by the maximum-likelihood method. Acta Crystallogr. D 53, 240-255
(1997).

31. Winn, M. D., Isupov, M. N. & Murshudov, G. N. Use of TLS parameters to model
anisotropic displacements in macromolecular refinement. Acta Crystallogr. D 57,
122-133 (2001).

32. Emsley, P., Lohkamp, B., Scott, W. G. & Cowtan, K. Features and development of
Coot. Acta Crystallogr. D 66, 486-501 (2010).

33. The PyMOL Molecular Graphics System, Version 1.5.0.4, Schrödinger, LLC.

34. Pandey, M. et al. Two mechanisms coordinate replication termination
by the Escherichia coli Tus-Ter complex. Nucleic Acids Res. 43, 5924-5935
(2015).

35. van Oijen, A. M. et al. Single-molecule kinetics of lambda exonuclease reveal base
dependence and dynamic disorder. Science 301, 1235-1238 (2003).

36. Bustamante, C., Marko, J. F., Siggia, E. D. & Smith, S. Entropic elasticity of lambda-
phage DNA. Science 265, 1599-1600 (1994).


[Extended Data Figure 1a, schematic labels: Flow; glass coverslip; inert and biotinylated PEG; streptavidin; biotin; 3.6 kb (λ DNA, EcoRI digest); 67 bp containing TerB sequence; 10.1 kb (λ DNA, ApaI digest); digoxigenin; anti-digoxigenin; magnetic bead]


b Ter variant substrate / oligonucleotide sequences 


PTerB 


leading (<Pol) 


5’ -CAAGTCACCACGACTGTGCTATAA, GGTTAATATTATGGCGCGTT-3" 
3’ -CCGGGTTCAGTGGTGCTGACACGATAT' .CCAATTATAATACCGCGCAATTAA- 5’ 


NPTerB 


P  jagging (<DnaB) 


leading (< Pol) 


6 1 
5’ -AACGCGCCATAATATTAACC, TTATAGCACAGTCGTGGTGACTTG- 3’ 
3’ -CCGGTTGCGCGGTATTATAATTGG AATATCGTGTCAGCACCACTGAACTTAA- 5’ 


CG(6)-NPTerB 


NP lagging (< DnaB) 


5’ -AACGCGCCATAATATTAACC, TTATAGCACAGTCGTGGTGACTTG- 3’ 
3’ -CCGGTTGCGCGGTATTATAATTGG C. AATATCGTGTCAGCACCACTGAACTTAA- 5’ 


TA(6)-NPTerB 


5’ -AACGCGCCATAATATTAACC TATAGCACAGTCGTGGTGACTTG - 3’ 
3’ -CCGGTTGCGCGGTATTATAATTGG AATATCGTGTCAGCACCACTGAACTTAA - 5’ 


TA(5)-NPTerB 


5’ -AACGCGCCATAATATTAACC! TTATAGCACAGTCGTGGTGACTTG - 3’ 
3’ -CCGGTTGCGCGGTATTATAATTGG T. AATATCGTGTCAGCACCACTGAACTTAA- 5’ 


5-mismatch C6-NPTerB 


5’ -AACGCGCCATAATATTAACC. 'TTATAGCACAGTCGTGGTGACTTG - 3’ 
3’ -CCGGTTGCGCGGTATTATAATTGG! TCC: TATCGTGTCAGCACCACTGAACTTAA- 5’ 


5-mismatch G6-NP TerB 


5’ -AACGCGCCATAATATTAACC TATAGCACAGTCGTGGTGACTTG - 3’ 
3’ -CCGGTTGCGCGGTATTATAATTGG! TCCA’ AATATCGTGTCAGCACCACTGAACTTAA- 5’ 


Swapped F4n GC(6)-NPTerB 


5’ -AACGCGCCATAATATTAACC AAT TATAGCACAGTCGTGGTGACTTG - 3’ 
3’ -CCGGTTGCGCGGTATTATAATTGG! AATTAATATCGTGTCAGCACCACTGAACTTAA- 5’ 


Swapped F5n GC(6)-NPTerB 


5’ -AACGCGCCATAATATTAACC 
3’ -CCGGTTGCGCGGTATTATAATTGG! 


NPTerH 


TAATTATAGCACAGTCGTGGTGACTTG - 3’ 
TTATTAATATCGTGTCAGCACCACTGAACTTAA- 5’ 


leading (<-Pol) 


6 1 
5’ -AACGCGCCATAATATTAACC TATAGCACAGTCGTGGTGACTTG - 3’ 
3’-CCGGTTGCGCGGTATTATAATTGG TATCGTGTCAGCACCACTGAACTTAA- 5’ 
NP 


Extended Data Figure 1 | Setup for leading-strand replication assays. a, A
schematic representation of the 13.7 kb DNA substrate construct. The substrate
contains a biotinylated fork at one end to attach it to the streptavidin-coated
glass coverslip and a digoxigenin moiety at the other end to attach it to a 2.8 µm
diameter anti-digoxigenin-coated paramagnetic bead. A single TerB site insert
is located at 3.6 kb from the biotinylated fork. b, Oligonucleotides used to
assemble wild-type and variants of TerB substrates for their ligation to the
3.6 kb EcoRI and 10.1 kb ApaI λ DNA fragments*. Native TerB residues are
highlighted in yellow except C6 that is in red. Non-native (modified) residues in
TerB are highlighted in grey. Native TerH residues are highlighted in orange.
Leading and lagging DNA strands as well as permissive (P) and non-permissive
(NP) faces of Ter when bound to Tus are denoted. Directionality of
translocation of DnaB that encircles the lagging strand as it unwinds dsDNA
during leading strand DNA synthesis by Pol III holoenzyme is denoted by
arrows.


[Extended Data Figure 2 panels: a, Stoppage at NPTerB; b, Bypass at NPTerB; c, Fate of forks at CG(6)-NPTerB. Axes: DNA synthesis (kbp) versus Time (s). Rates annotated on trajectories: a, 895, 680, 720, 615 and 1,050 bp s⁻¹; b, 1,400, 1,785, 2,410, 1,805, 2,240, 1,290 and 1,780 bp s⁻¹]


Extended Data Figure 2 | Examples of trajectories for leading-strand
synthesis upon encountering Tus bound to non-permissive Ter sites. The
location of the TerB site at 3.6 ± 0.1 kb is indicated by the dashed lines. The
rates of leading strand synthesis were calculated by fitting the slopes of the
trajectories by linear regression using a least-squares approach. The replisomes
displayed heterogeneity in rates of DNA synthesis. a, Trajectories where forks
stopped at the NPTerB site. The average stoppage time captured within our
acquisition time was 9.7 ± 1 min (uncertainty is the standard error) as
illustrated for the top trajectory. b, Trajectories where forks displaced Tus and
bypassed the NPTerB site without displaying any transient stoppage. c, Fate of
the replication fork upon encountering CG(6)-NPTerB. Examples of
trajectories for leading-strand synthesis upon encountering Tus bound to
CG(6)-NPTerB showing transient stoppage at CG(6)-NPTerB, followed by
resumption of DNA synthesis; 56% of the restarted events displayed DNA
synthesis with disrupted behaviour (top row) while 44% showed normal
behaviour (bottom row). We attributed the disrupted restart of DNA synthesis
in some of the trajectories to the replisome losing some components other than
DnaB during stalling.
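
The rate estimate mentioned in this legend, a least-squares linear fit of DNA synthesis against time, can be sketched as follows. This is a generic ordinary least-squares implementation for illustration, not the authors' analysis software, and the function name `fit_rate` is hypothetical.

```python
def fit_rate(times_s, synthesis_bp):
    """Ordinary least-squares fit of synthesis (bp) against time (s).

    Returns (slope in bp/s, intercept in bp, r squared).
    """
    n = len(times_s)
    if n < 2 or n != len(synthesis_bp):
        raise ValueError("need two or more paired samples")
    mean_t = sum(times_s) / n
    mean_y = sum(synthesis_bp) / n
    sxx = sum((t - mean_t) ** 2 for t in times_s)
    syy = sum((y - mean_y) ** 2 for y in synthesis_bp)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times_s, synthesis_bp))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_t
    r_squared = (sxy * sxy) / (sxx * syy) if syy > 0 else 1.0
    return slope, intercept, r_squared


if __name__ == "__main__":
    # Synthetic, noise-free trajectory sampled at 2 Hz (0.5 s per data point).
    times = [0.5 * i for i in range(10)]
    synthesis = [100.0 + 800.0 * t for t in times]
    print(fit_rate(times, synthesis))
```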


[Extended Data Figure 3 panels: a, TerB, probability of termination versus Total synthesis (kb); b, Processivity = 5.5 ± 1.2 kb (N = 88), Number of events; c (left), −Tus, Rate = 930 ± 70 bp s⁻¹ (N = 94), Number of events versus Event rate (bp s⁻¹)]


[Extended Data Figure 3 panels, continued: b, x axis Processivity (kb); c (right), +Tus, Rate = 990 ± 70 bp s⁻¹ (N = 69), Number of events versus Event rate (bp s⁻¹)]

Extended Data Figure 3 | Effect of TerB site alone and nonspecifically DNA-
bound Tus on DNA synthesis. a, Probability of termination of DNA synthesis
at 0.2 kb intervals (spatial resolution of the assay) along the 13.7 kb NPTerB in
the absence of Tus, showing that stops at TerB (3.5-3.7 kb, denoted by black arrow)
occur randomly with a 3% probability when all events were considered, in
contrast to 5% when only events that reached TerB (≥3.5 kb) were taken into
account. b, Processivity of DNA synthesis on the NPTerB substrate in the
absence of Tus. The processivity distribution is fit with an exponential decay
(N = 88) and uncertainty corresponds to the standard error, illustrating the
random stoppage behaviour of the replisome during synthesis. c, Rate of
leading strand synthesis using the 13.7 kb force-calibrated DNA construct
(NPTerB in this case) in the absence (left panel; N = 94) or presence of Tus
(right panel; N = 69). The rate distributions were fit with a Gaussian
distribution. The fit lines are shown and the uncertainties correspond to the
standard error. The rate agrees with our previously reported rate using force-
calibrated λ DNA constructs", demonstrating the accurate force calibration of
the 13.7 kb substrate.


[Extended Data Figure 4 panels: a, NPTerB, rates averaged over whole trajectories, Bypass (N = 32) and Stop/restart (N = 31), axis Rate (bp s⁻¹); b, rates averaged over 1.5 s before the Ter site, Stop/restart Rate = 860 ± 40 bp s⁻¹ (N = 31) and Bypass Rate = 1,810 ± 100 bp s⁻¹ (N = 32), axis Event rate (bp s⁻¹); c, r = 0.81, x axis Brownian motion of DNA (bp); d, Average rate = 800 bp s⁻¹, Std. Dev. = 260 bp s⁻¹, % of fluctuation = 32%, linear regression slope = 795 bp s⁻¹, r² = 0.988, x axis Time (s); e, r = −0.18, x axis Rate (bp s⁻¹)]


Extended Data Figure 4 | Linear fitting of the rate of leading-strand
synthesis is appropriate for deriving the correlation between rate of DNA
synthesis and stalling activity at the NPTerB site. a, Rate dependence of fork
arrest at NPTerB. A scatter plot of forks that stopped (N = 31) or bypassed
(N = 32) Tus bound to NPTerB; rates were calculated by fitting the DNA
shortening phase of the entire trajectory in cases of events that bypassed and
up to the stoppage point in events that stopped/restarted (histograms are shown
in Fig. 1g). A significant correlation between fork progression rate and fork
bypass at NPTerB is observed using a one-sided Pearson’s correlation test at
the 0.05 level of significance (the calculated correlation coefficient (r) was 0.62).
The Pearson’s correlation coefficient was calculated using the equation

$$ r = \frac{\sum_{i=1}^{n}(x_i-\bar{x})(y_i-\bar{y})}{\sqrt{\left[\sum_{i=1}^{n}(x_i-\bar{x})^{2}\right]\left[\sum_{i=1}^{n}(y_i-\bar{y})^{2}\right]}} $$

b, Scatter plot (left) and rate
distributions (right) of leading-strand synthesis for events that bypassed (grey
bars) (N = 32) or stopped/restarted (blue bars) (N = 31) at NPTerB when
the rate was estimated from fitting the slope of the three data points before the
TerB site (acquisition time is 0.5 s per data point). The rates were fit with a
Gaussian distribution and uncertainty corresponds to the standard error. The
calculated average rates for events that bypassed or stopped/restarted at
NPTerB are similar to those calculated when the rates were fit using the DNA
shortening phase of the entire trajectory in cases of events that bypassed and
up to the stoppage point in events that stopped/restarted (shown in Fig. 1g
in the main text), underscoring the suitability of linear fitting of the rate.
Furthermore, r² from linear regression fits was 0.95 ± 0.05. c, The correlation
between apparent fluctuation in rate of DNA synthesis within individual DNA
molecules and their corresponding Brownian motion (N = 23). d, The individual
trajectories displayed apparent fluctuation in rate of DNA synthesis as illustrated
in a representative trajectory where we zoomed in at the DNA shortening phase
and fit the rate linearly to intervals of three consecutive data points. The
percentage of apparent fluctuation in rate of DNA synthesis within individual
DNA molecules was calculated by dividing the standard deviation of the average
of interval rates over the average rate. The standard deviation of the average
of Brownian motion of each individual DNA molecule was calculated from the
fluctuation of the DNA before and after being replicated. The percentage of
apparent fluctuation in rate of individual DNA molecules displayed a strong
positive correlation with their corresponding Brownian motion when analysed
by two-sided Pearson’s correlation test at the 0.05 level of significance (r = 0.81,
panel c). e, The correlation between the percentage of apparent fluctuation in
rate and the average rate of individual molecules. The percentage of apparent
fluctuation in rate of individual molecules was calculated as described in d
and for the same 23 replisomes. There was no correlation between the average
rate of individual DNA molecules and their corresponding percentage of
apparent fluctuation in rate; the Pearson’s correlation coefficient was −0.18.
The results from c-e demonstrate that one strong factor behind the apparent
fluctuation in rates within our individual 13.7 kb molecules under our spatial
and temporal resolution is the Brownian motion of the DNA and that this
apparent fluctuation in rate does not bias the estimates of speed of the replisomes.
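
The two statistics used in this legend, Pearson's correlation coefficient and the percentage of apparent fluctuation in rate, can be computed as in the sketch below. It is an illustration only: the function names are hypothetical, and the use of the sample (n − 1) standard deviation is an assumption, since the legend does not say which estimator was used.

```python
import math


def pearson_r(xs, ys):
    """Pearson's correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need two or more paired values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def percent_fluctuation(interval_rates):
    """Standard deviation of interval rates divided by their mean, in percent.

    Uses the sample (n - 1) standard deviation, which is an assumption.
    """
    n = len(interval_rates)
    mean = sum(interval_rates) / n
    variance = sum((r - mean) ** 2 for r in interval_rates) / (n - 1)
    return 100.0 * math.sqrt(variance) / mean


if __name__ == "__main__":
    # Synthetic values for illustration only.
    print(pearson_r([1, 2, 3, 4], [2.1, 3.9, 6.2, 7.8]))
    print(percent_fluctuation([90.0, 100.0, 110.0]))
```

For the representative trajectory in panel d, the annotated standard deviation of 260 bp s⁻¹ and average rate of 800 bp s⁻¹ give 260/800 ≈ 32%, matching the quoted percentage of fluctuation.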


[Extended Data Figure 5, panels a-f: crystal-structure images with the oligonucleotide sequence beneath each panel; panel a is labelled wt forked Ter]


Extended Data Figure 5 | Crystal structures of Tus complexes with Ter
oligonucleotides. The sequences of oligonucleotides used for each complex are
shown at the bottom of each panel; nucleotides for which electron density
could not be interpreted are highlighted. a–d, Complexes of Tus proteins with
forked Ter sites. The C6-binding pocket is shown in the circle, with key
residues Ile79, Phe140 and His144 in the binding pocket, and Arg198 shown in
stick form. a, The wild-type Tus–Ter lock (PDB code: 2I06), with C6 located in
the binding pocket, and the TA(7) base pair melted. Arg198 is positioned to
interact with the 5′-phosphate of T7. b, Complex of wild-type Tus with a forked
oligonucleotide that has C6 substituted by a mispaired G (UGLT: upper G,
lower T; PDB code: 4XR0); G6 does not occupy the pocket nor does it make any
new specific interactions with Tus, and Arg198 no longer interacts with the
5′-phosphate of T7. c, Further extension of the mismatched region in b to
include A7 (TGTA: mispaired TGTA on the lower strand; PDB code: 4XR1)
does not enable G6 to occupy the C6-binding pocket or form any new specific
interactions. d, Tus(H144A) in complex with the normal Tus–Ter lock
oligonucleotide (PDB code: 4XR2), showing the mispaired C6 does not occupy
the cytosine-binding pocket or form any new interactions with Tus.
e, f, Potential interactions of Arg198 in crystal structures of Tus complexes with
fully base-paired Ter oligonucleotides. Only nucleotides in base pairs 5 and 6
are shown, and they are colour-coded to match the stick representations of
them in the figures. Arg198 is shown in yellow stick representation. e, Structure
of the wild-type Tus–TerA (GC(6)) complex (PDB code: 2I05). Arg198 is
positioned potentially to make H-bonding interactions with the A5, G6 and T5
bases and the deoxyribose ring oxygen of G6, as well as electrostatic interactions
with the 5′-phosphate of A5, as suggested previously'’ and demonstrated by
molecular dynamics simulations (A.J.O., unpublished observations).
f, Structure of the complex with a GC(6)-flipped version of the TerA
oligonucleotide (UGLC: upper G, lower C; PDB code: 4XR3) showing an
alternate major conformation of the Arg198 side-chain that has lost all base-
specific interactions; only the interaction with the sugar ring oxygen of the
substituted C6 is maintained.


[Extended Data Figure 6 panels: a, 5-mismatch G6-NPTerB trajectories, DNA synthesis versus Time (s); b, Pause duration = 177 ± 20 s (N = 20), Number of events versus Pause time (s)]


Extended Data Figure 6 | Fate of the replication fork upon encountering 
5-mismatch G(6)-NPTerB. a, Examples of trajectories of replication forks that 
transiently stopped at Tus bound to the bubble template with C6 switched to G6 
(5-mismatch G6-NPTerB). b, The distribution of the pause durations fit with a 
single exponential decay. The fit line is shown in black and the uncertainty 
corresponds to the standard error (N = 20). 
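
A single-exponential description of pause durations can be sketched as below. The technique is swapped in for illustration: the legend describes fitting the histogram, whereas this sketch uses the maximum-likelihood estimate for an exponential distribution (the mean), so its uncertainty will generally differ from the fitted standard error quoted in the figure. It also ignores the 3 s minimum pause threshold; both simplifications are assumptions.

```python
import math


def fit_exponential(durations_s):
    """Maximum-likelihood time constant (s) of an exponential distribution
    and its approximate standard error, tau / sqrt(N)."""
    n = len(durations_s)
    if n == 0 or min(durations_s) < 0:
        raise ValueError("need at least one non-negative duration")
    tau = sum(durations_s) / n
    return tau, tau / math.sqrt(n)


def fraction_still_paused(t, tau):
    """Fraction of pauses expected to last longer than t seconds."""
    return math.exp(-t / tau)


if __name__ == "__main__":
    # Synthetic pause durations for illustration only.
    tau, se = fit_exponential([35.0, 80.0, 120.0, 210.0, 400.0])
    print(f"tau = {tau:.0f} s (SE {se:.0f} s)")
    print(f"fraction paused longer than 300 s: {fraction_still_paused(300.0, tau):.2f}")
```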


[Extended Data Figure 7 panels (rate histograms: Number of events versus Event rate (bp s⁻¹)): a, TA(5)-NPTerB, Stop/restart Rate = 360 ± 40 bp s⁻¹ (N = 8), Bypass Rate = 1,320 ± 60 bp s⁻¹ (N = 17); b, swapped F4n GC(6)-NPTerB, Stop/restart Rate = 860 ± 90 bp s⁻¹ (N = 12), Bypass Rate = 1,550 ± 110 bp s⁻¹ (N = 17); c, swapped F5n GC(6)-NPTerB, Stop/restart Rate = 310 ± 70 bp s⁻¹ (N = 9), Bypass Rate = 1,260 ± 60 bp s⁻¹ (N = 27); d, TA(6)-NPTerB trajectories versus Time (s); e, TA(6)-NPTerB, Pause duration = 24 ± 2 s (N = 8), axis Pause time (s); f, TA(6)-NPTerB, Stop/restart Rate = 360 ± 40 bp s⁻¹ (N = 11), Bypass Rate = 1,310 ± 50 bp s⁻¹ (N = 30); g, NPTerH trajectories versus Time (s); h, NPTerH, Stop/restart Rate = 350 ± 20 bp s⁻¹ (N = 11), Bypass Rate = 1,300 ± 60 bp s⁻¹ (N = 26)]


Extended Data Figure 7 | Fate of the replication fork upon encountering
NPTerB sites with swapped sequences in the first five base pairs, TA(6)-
NPTerB and NPTerH. Rate dependence of replication fork arrest at Tus
bound to: a, TA(5)-NPTerB (N = 25); b, swapped F4n GC(6)-NPTerB
(N = 29); c, swapped F5n GC(6)-NPTerB (N = 36). The rate distributions of
leading-strand synthesis are shown for events that bypassed (grey bars) or stopped/
restarted (blue bars) at these sequences. d, Examples of trajectories of leading-
strand synthesis that transiently stopped at Tus bound to TA(6)-NPTerB. 75%
of the restarted events displayed DNA synthesis of normal behaviour (left
traces) while 25% showed disrupted behaviour (right trace). e, The distribution
of the pause durations at TA(6)-NPTerB fit with a single exponential decay
(N = 8). f, The rate distribution of events that bypassed (N = 30; grey bars) or
stopped/restarted (N = 11; blue bars) at TA(6)-NPTerB. g, Examples of
trajectories of leading-strand synthesis that transiently stopped at Tus bound to
NPTerH. The average pause duration was 180 ± 26 s (N = 4). The uncertainty
is the standard error. h, The rate distribution of leading-strand synthesis for
events that bypassed (N = 26; grey bars) or stopped/restarted (N = 11; blue
bars) at NPTerH. The histograms in a-c, e, f and h were fit to Gaussian
distributions, the fit lines are shown, and the uncertainties correspond to the
standard error.


©2015 Macmillan Publishers Limited. All rights reserved 
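The Extended Data Figure 7 legend quotes an average pause duration together with its standard error. As a minimal sketch of how such a summary is usually computed (assuming the standard error is the sample standard deviation divided by the square root of N), the Python below uses made-up pause durations; the function name and the input values are illustrative only and are not the measured data.

```python
import math
import statistics


def mean_and_standard_error(values):
    """Return (mean, standard error of the mean) for a list of measurements."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two measurements")
    mean = statistics.fmean(values)
    # Sample standard deviation (n - 1 in the denominator) divided by sqrt(n).
    sem = statistics.stdev(values) / math.sqrt(n)
    return mean, sem


# Hypothetical pause durations in seconds, chosen for illustration only.
pauses_s = [120.0, 160.0, 200.0, 240.0]
mean, sem = mean_and_standard_error(pauses_s)
print(f"{mean:.0f} ± {sem:.0f} s (N = {len(pauses_s)})")
```

With only a handful of events the standard error is itself poorly determined, so a summary of this kind is best read as a rough indication of spread.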


[Extended Data Figure 8, panels a-f: SPR sensorgrams (response versus time
(s)) for Tus wt–dsTerB (a), Tus wt–forked TerB (b), Tus(R198A)–dsTerB (c),
Tus(R198A)–forked TerB (d), Tus(H144A)–dsTerB (e) and Tus(H144A)–forked
TerB (f). Insets in a, c, e and f; those in c, e and f plot the response at
equilibrium versus concentration (nM).]

g
Panel      Interaction                 [Tus] (nM)  KD (nM)         ka (M⁻¹ s⁻¹)          kd (s⁻¹)               t1/2 (s)
a          Tus–dsTerB                  0.25-8      1.86 ± 0.08     (3.78 ± 0.08) × 10⁶   (7.04 ± 0.15) × 10⁻³   98
a (inset)  Tus–dsTerB (simulated)      0.25-8      1.86            3.78 × 10⁶            7.04 × 10⁻³            98
b          Tus–forked TerB             0.25-4      0.212 ± 0.002   (2.89 ± 0.00) × 10⁵   (6.12 ± 0.07) × 10⁻⁵   11,320
c          Tus(R198A)–dsTerB           20-1,280    441 ± 10        (2.99 ± 0.00) × 10⁴   (1.32 ± 0.03) × 10⁻²   53
c (inset)                              20-1,280    780 ± 60
d          Tus(R198A)–forked TerB      2.5-40      1.80 ± 0.01     (5.52 ± 0.00) × 10⁴   (9.95 ± 0.03) × 10⁻⁵   6,970
e          Tus(H144A)–dsTerB           2.5-320     39.3 ± 1.0      (3.92 ± 0.02) × 10⁵   (1.54 ± 0.03) × 10⁻²   45
e (inset)                              2.5-320     49.5+42.3
f          Tus(H144A)–forked TerB      20-1,280    598 ± 8         (9.09 ± 0.07) × 10⁴   (5.44 ± 0.03) × 10⁻²   13
f (inset)                              20-1,280    805 ± 70

Extended Data Figure 8 | SPR assessment of Tus–TerB interactions:
whereas Tus and Tus(R198A) are capable of forming a lock, Tus(H144A) is
not. ProteOn sensorgrams show association and dissociation phases of
Tus–TerB interactions at ranges of Tus concentrations (as specified in g) of
serially-diluted samples of Tus proteins. Curves, shown in colours, were fit
simultaneously (black curves) to various binding models (see Methods).
a, Wild-type Tus and dsTerB. Considering that the ka > 1 × 10⁶ M⁻¹ s⁻¹
suggests significant mass transport limitations, the LMT model was used to fit
the data with Rmax constrained to 700 RU. The derived kinetic parameters were
used to simulate sensorgrams devoid of mass transfer limitation using the L
model (inset). b, Wild-type Tus–forked TerB interaction; Rmax was
constrained to 775 RU. The fit kd is in good agreement with the value of
(5.20 ± 0.00) × 10⁻⁵ s⁻¹ obtained from an independent experiment where
dissociation was monitored over 50,000 s (not shown). c, Tus(R198A)–dsTerB
interaction. Binding kinetics parameters were obtained using the HLPR model.
The sum of fit Rmax1 (543 ± 9) and Rmax2 (54 ± 5 RU) values were in reasonable
agreement with the expected value of ~700 RU. Only the relevant ka and kd
values of the predominant (based on Rmax1) interaction are presented in g.
For assessment of the fitting procedure, responses at equilibrium were fit using
the L model (inset). The derived KD was within the factor of two of the
calculated KD obtained from kinetic parameters (kd/ka). The Rmax value of
816 ± 32 RU was slightly higher than theoretical (700 RUs), probably owing
to some non-specific binding in the high range of Tus concentration. d,
Tus(R198A)–forked TerB interaction. The L model was used to fit the
data with Rmax constrained to 775 RU. The fit kd was within a factor of
two of the value, (5.70 ± 0.00) × 10⁻⁵ s⁻¹, derived from an independent
experiment where dissociation was monitored over 50,000 s (not shown). e,
Tus(H144A)–dsTerB interaction. Binding kinetic parameters were obtained
using the HLPR model. The sum of fit Rmax1 (537 ± 1) and Rmax2 (31 ± 0 RU)
values were in reasonable agreement with the expected value of ~700 RU. Only
the relevant ka and kd values of the predominant interaction (Rmax1) are
presented in g. For assessment of the fitting procedure, responses at equilibrium
were fit using the L model (inset). The derived KD was within a factor of
1.5 of KD obtained from the kinetic parameters. In addition, the fit Rmax value
of 621 ± 10 RU compares reasonably to the expected value of ~700 RU. f,
Tus(H144A)–forked TerB interaction. Binding kinetics parameters were
obtained using the HLPR model. The sum of fit Rmax1 (879 ± 4) and Rmax2
(65 ± 1) values were somewhat high compared to the expected value of
~775 RU. Only the relevant ka and kd values of the predominant reaction are
presented in g. Responses at equilibrium were fit using the L model (inset).
Derived KD was within the factor of 2 of the calculated KD obtained from
(kd/ka). In addition, fit Rmax value of 1,040 ± 50 RU was slightly higher than
theoretical. g, Summary of binding parameters for Tus–Ter interactions.
All uncertainties are standard errors in parameters from fitting of complete
data sets to appropriate binding models as described in the Methods. Data
are representative of those from two technical replicates using different
instruments (BiaCore T200 and ProteOn XPR-36).
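The legend above compares equilibrium KD values with those calculated from the kinetic parameters (kd/ka). A minimal Python sketch of that arithmetic follows, together with the half-life of the complex under the assumption of simple first-order dissociation (t1/2 = ln 2 / kd). The function names are illustrative, and the inputs are an assumed reading of the panel a entries in g (ka = 3.78 × 10⁶ M⁻¹ s⁻¹, kd = 7.04 × 10⁻³ s⁻¹), not an independent measurement.

```python
import math


def dissociation_constant_nM(ka_per_M_s, kd_per_s):
    """Equilibrium dissociation constant KD = kd / ka, converted from M to nM."""
    return kd_per_s / ka_per_M_s * 1e9


def half_life_s(kd_per_s):
    """Half-life of a complex that dissociates with first-order rate kd."""
    return math.log(2) / kd_per_s


# Assumed reading of panel a (wild-type Tus and dsTerB).
ka = 3.78e6   # association rate constant, M^-1 s^-1
kd = 7.04e-3  # dissociation rate constant, s^-1

print(round(dissociation_constant_nM(ka, kd), 2))  # KD in nM
print(round(half_life_s(kd)))                      # half-life in s
```

Under these assumptions the two printed values come out close to the KD and t1/2 entries listed for panel a, which is a quick way to check that a row of kinetic parameters is internally consistent.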


[Extended Data Figure 9: histogram of number of events versus event rate
(bp s⁻¹) for Tus(H144A) / NPTerB, with stop/restart (N = 15) and bypass
(N = 18) events.]


Extended Data Figure 9 | Fate of the replication fork upon encountering 
Tus(H144A) bound to NPTerB. Rate dependence of fork arrest. The rate 
distribution of leading-strand synthesis for events that bypassed (N = 18; grey 
bars) or stopped/restarted (N = 15; blue bars) at NPTerB fit with Gaussian 
distributions. The fit lines are shown and the uncertainties correspond to the 
standard error. 



Extended Data Table 1 | Data collection and refinement statistics for Tus-Ter complexes

                        wt Tus / forked Ter-UGLT    wt Tus / forked Ter-TGTA    Tus(H144A) / wt forked Ter  wt Tus / dsTer-UGLC
                        [C6 to G mutant]            [C6 to G, T7 to A mutant]                               [GC(6) to CG(6) mutant]

Data collection
Space group             P4,2,2                      P4,2,2                      P4,2,2                      P4,2,2
Cell dimensions
  a, b, c (Å)           64.5, 64.5, 248.3           64.8, 64.8, 246.7           64.5, 64.5, 250.9           64.1, 64.1, 249.3
  α, β, γ (°)           90, 90, 90                  90, 90, 90                  90, 90, 90                  90, 90, 90
Resolution (Å)          75-2.80 (2.90-2.80)         75-2.40 (2.50-2.40)         75-2.35 (2.43-2.35)         75-2.70 (2.80-2.70)
Rsym                    11.1 (74.0)                 12.0 (71.1)                 8.0 (85.6)                  10.6 (84.1)
I/σI                    39.5 (6.6)                  24.0 (3.3)                  26.7 (2.0)                  27.2 (3.6)
Completeness (%)        99.8 (99.6)                 99.9 (100)                  99.6 (100)                  99.8 (100)
Redundancy              22.1 (18.9)                 9.7 (10.8)                  6.6 (7.5)                   13.6 (13.4)

Refinement
Resolution (Å)          45-2.80 (2.86-2.80)         63-2.40 (2.46-2.40)         62-2.35 (2.41-2.35)         62-2.70 (2.77-2.70)
No. reflections         12,822 (901)                20,288 (1,521)              21,819 (1,658)              14,304 (1,086)
Rwork / Rfree           21.7 / 29.6                 19.8 / 26.9                 22.0 / 26.6                 20.7 / 28.4
No. atoms
  Protein               2,504                       2,530                       2,509                       2,503
  Nucleic acid          577                         574                         592                         612
  Ligand/ion            13                          14                          14                          12
  Water                 31                          96                          69                          32
B-factors
  Protein               25.3                        43.3                        39.3                        61.3
  Nucleic acid          26.5                        45.1                        38.4                        57.4
  Ligand/ion            28.7                        50.3                        58.1                        61.4
  Water                 35.7                        57.2                        48.1                        61.3
R.m.s. deviations
  Bond lengths (Å)      0.016                       0.017                       0.013                       0.013
  Bond angles (°)       2.16                        2.13                        1.71                        1.97

A single crystal was used in each case. Numbers in parentheses refer to the highest resolution bin.


doi:10.1038/nature14906 


Integrator mediates the biogenesis of enhancer RNAs 


Fan Lai1*, Alessandro Gardini1*, Anda Zhang1 & Ramin Shiekhattar1


Integrator is a multi-subunit complex stably associated with 
the carboxy-terminal domain (CTD) of RNA polymerase II 
(RNAPII)1. Integrator is endowed with a core catalytic RNA
endonuclease activity, which is required for the 3’-end processing
of non-polyadenylated, RNAPII-dependent, uridylate-rich, small
nuclear RNA genes1. Here we examine the requirement of Integrator
in the biogenesis of transcripts derived from distal regulatory
elements (enhancers) involved in tissue- and temporal-specific
regulation of gene expression in metazoans2-5. Integrator is recruited to
enhancers and super-enhancers in a stimulus-dependent manner. 
Functional depletion of Integrator subunits diminishes the signal- 
dependent induction of enhancer RNAs (eRNAs) and abrogates 
stimulus-induced enhancer-promoter chromatin looping. Global 
nuclear run-on and RNAPII profiling reveals a role for Integrator 
in 3’-end cleavage of eRNA primary transcripts leading to transcrip- 
tional termination. In the absence of Integrator, eRNAs remain 
bound to RNAPII and their primary transcripts accumulate. 
Notably, the induction of eRNAs and gene expression responsive- 
ness requires the catalytic activity of Integrator complex. We 
propose a role for Integrator in biogenesis of eRNAs and enhancer 
function in metazoans. 

To assess the role for Integrator in the biogenesis of eRNAs, we 
examined the signal-dependent recruitment of Integrator complex to 
enhancer sites. HeLa cells were starved of serum for 48 h, after which 
they were stimulated with epidermal growth factor (EGF) to induce 
immediate early genes (IEGs). We identified 2,029 enhancers based on 
their occupancy by RNAPII, CBP/p300 and containing acetylated his- 
tone H3 lysine 27 (H3K27ac) chromatin modification (see Methods). 
We found that while assessing steady-state levels of eRNAs provided 
a measure of EGF-induced eRNAs, we obtained a better read-out of 
eRNAs after sequencing of the chromatin-enriched RNA fractions 
(ChromRNA-seq)6. We focused on 91 enhancers that displayed
EGF-induced eRNAs in the proximity of EGF-responsive genes following
20 min of induction (Extended Data Fig. 1, Supplementary Table 1
and see Methods). Notably, the chromatin surrounding these enhancers 
displayed the H3K27ac modification in starved cells, and following EGF 
stimulation there was a small increase in H3K27ac levels (Extended Data 
Fig. 1b). To assess the polyadenylation state of eRNAs, total RNA was 
enriched for polyadenylated and non-polyadenylated fractions and was 
subjected to high-throughput sequencing. Similar to previous reports, 
EGF-induced enhancers displayed bi-directional eRNAs that were pre- 
dominantly not polyadenylated (Extended Data Fig. 2)°”. 

We next analysed Integrator occupancy at these enhancers by using 
antibodies against the INTS11 subunit of the Integrator complex 
before and after EGF stimulation. While these enhancers were occu- 
pied by a detectable amount of Integrator before EGF induction, addi- 
tion of EGF resulted in a further recruitment of Integrator complex 
(Fig. 1a-c). RNAPII displayed a similar pattern of stimulus-dependent
chromatin residence (Fig. 1d, e). The stimulus-dependent recruitment 
of Integrator at enhancers was further confirmed using two additional 
antibodies against INTS1 and INTS9 subunits of the Integrator 
complex (Extended Data Fig. 3a). These results demonstrated the
stimulus-dependent recruitment of the Integrator complex at
EGF-responsive enhancers.

To examine the functional importance of Integrator at enhancers 
and its role in the biogenesis of eRNAs, we developed HeLa clones 
expressing doxycycline-inducible short hairpin RNAs (shRNAs) 
against INTS11 and INTS1 subunits of the Integrator complex 
(Extended Data Fig. 3b). Within the time course of these experiments 
the mature levels of small nuclear RNAs (snRNAs) were not perturbed 
(data not shown). Twenty minutes of EGF stimulation resulted in the 
induction of bi-directional eRNAs similar to previous reports (Fig. 1a, f 
and Extended Data Fig. 1c-h)°*"*. Depletion of INTS11 diminished 
the eRNA induction after EGF stimulation (Fig. 1f; as shown at two 
enhancer loci; enhancers were named after their proximity to an EGF- 
responsive gene). The fold induction of eRNAs at all EGF-induced 
enhancers decreased significantly (Fig. 1g, h). We also observed a 
significant decrease in the transcriptional induction of EGF-responsive 
protein-coding genes in the proximity of these EGF-induced enhan- 
cers (Fig. 1g, h). Interestingly, there was a subtle increase (statistically 
not significant) in H3K27 acetylation at enhancers following EGF 
stimulation, which was reduced after Integrator depletion (Fig. 1f 
and Extended Data Fig. 3c). 

To gain further insight into quantitative changes in eRNAs follow- 
ing depletion of Integrator, we depleted INTS11 or INTS1 and per- 
formed a time-course analysis of eRNA induction using specific 
primer sets for each strand. Depletion of either Integrator subunit 
diminished the EGF-induced increase in eRNA levels from both 
strands of the enhancers (Extended Data Fig. 4a, b). Analysis of 
regulatory landscape in the proximity of the EGF-responsive gene 
ATF3 (activating transcription factor 3) revealed the presence of 
clusters of acetylated H3K27 and p300 binding sites similar to that 
described for super-enhancers'*'* (Extended Data Fig. 4c). This 
region also displayed occupancy by RNAPII at multiple sites, and 
we could detect additional recruitment of RNAPII and Integrator 
to these sites following EGF stimulation (Extended Data Fig. 4c). 
Analysis of eRNA synthesis using strand-specific RNA-seq and 
real-time PCR (during a time-course experiment) demonstrated a 
requirement for Integrator in the induction of eRNAs at the super- 
enhancer sites after EGF stimulation (Extended Data Fig. 4d). 
Collectively, these results highlight a requirement for Integrator in 
stimulus-dependent induction of eRNAs from individual enhancers 
and enhancer clusters. 

An important component of enhancer function is the formation of 
stimulus-dependent chromatin looping, allowing enhancer and pro- 
moter communication’*"'’. We measured chromatin looping between 
NR4A1 and DUSP1 enhancers and their respective promoters using 
chromosome conformation capture (3C) following stimulation with 
EGF (Fig. 2a). We observed a robust association between the enhancer 
and the promoter regions of NR4A1 and DUSP1 after EGF stimulation 
(Fig. 2b). Remarkably, depletion of Integrator abrogated the EGF- 
induced chromatin looping without any effect on non-stimulus- 
induced chromosomal interactions (Fig. 2b, c and Extended Data 
Fig. 5a, b). These results demonstrate that Integrator regulates
enhancer function as reflected by the physical association between
enhancers and their respective promoters.

To gain an insight into the mechanism by which Integrator regulates
enhancer function and eRNA biogenesis, we depleted Integrator
and performed RNAPII profiling and global nuclear run-on followed
by high-throughput sequencing (GRO-seq) after EGF induction.
Notably, Integrator depletion resulted in the increase and spreading
of GRO-seq reads throughout the body of eRNA transcripts at both
enhancers and super-enhancers, which was mirrored by a concomitant
increase and spreading of RNAPII localization (Fig. 3a, b).
Indeed, the average profile of depth-normalized reads of 91
EGF-induced enhancers showed a significant accumulation of GRO-seq
and RNAPII ChIP-seq reads (Extended Data Fig. 6a, b). Analysis of
RNAPII travelling ratio, a measure of RNAPII productive elongation,
revealed that in contrast to EGF-responsive protein coding genes,
which experience a block in productive elongation after Integrator
depletion’*, there is increased RNAPII occupancy in the body of
eRNA transcripts (Extended Data Fig. 6c, d). The accumulation of
RNAPII at eRNA loci after Integrator depletion occurred despite the
decreased recruitment of super elongation complex (SEC) to enhancers
(Extended Data Fig. 7a, b).

1University of Miami Miller School of Medicine, Sylvester Comprehensive Cancer Center, Department of Human Genetics, Biomedical Research Building, Room 719, 1501 NW 10th Avenue, Miami, Florida 33136, USA.
*These authors contributed equally to this work.

17 SEPTEMBER 2015 | VOL 525 | NATURE | 399

[Figure 1, panels a-h: genome-browser tracks at the NR4A1 and DUSP1
enhancers (a, f); average profiles of INTS11 and RNAPII around enhancers
(b, d); box plots for eRNAs and protein-coding genes (c, e, g, h).]

Figure 1 | Integrator mediates induction of eRNAs. a, EGF induction of an
enhancer in the vicinity of the NR4A1 gene (see Extended Data Fig. 1i). RNAPII
and INTS11 are recruited to the enhancer after 20 min of stimulation and
eRNAs are transcribed bi-directionally from the locus (as revealed by deep
sequencing of chromatin-associated RNA, ChromRNA-seq). The y axis
represents the read counts normalized to sequencing depth. b, Average profile
of Integrator recruitment to 91 EGF-responsive enhancers. TSS indicates
transcription start site. The y axis shows the average of read density. c, Increased
Integrator occupancy at enhancers and their corresponding protein-coding
genes (mean density was calculated as follows: 6 kb surrounding the peak of
RNAPII for eRNAs; from −0.5 kb to +2.5 kb for coding genes; P < 0.001).
Whiskers on the box plot indicate the variability in the datasets. d, Average
profile of RNAPII upon EGF treatment at enhancers. e, Increased RNAPII
occupancy following EGF stimulation at enhancers and their corresponding
protein-coding genes (P < 0.005). f, Inducible knockdown of INTS11
(doxycycline (dox)) markedly reduces steady-state levels of eRNAs (as
measured by total RNA-seq). Data were obtained using a tet-inducible shRNA
system, stably transduced in HeLa cells. Acetylation of H3K27 is also shown.
g, h, Average expression levels of 91 eRNAs and their neighbouring (<500 kb)
57 protein-coding genes indicate a significant impairment of activation. Box
plots represent the expression fold change (log2) before and after EGF
treatment in normal conditions (ctrl) and upon depletion of Integrator (dox)
(t-test, P < 0.0005 for all panels). Fold change of RPKM (reads per kilobase of
exon per million mapped reads) values was calculated from RNA-seq (f) and
ChromRNA-seq (g) data.
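The Figure 1 legend reports expression changes as fold changes of RPKM (reads per kilobase of exon per million mapped reads) on a logarithmic scale. A minimal Python sketch of that calculation is given here under the usual definition of RPKM and assuming a base-2 logarithm; the function names and the read counts are hypothetical and are not taken from the study's data or pipeline.

```python
import math


def rpkm(read_count, exon_length_bp, total_mapped_reads):
    """Reads per kilobase of exon per million mapped reads."""
    if exon_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("length and library size must be positive")
    return read_count * 1e9 / (exon_length_bp * total_mapped_reads)


def log2_fold_change(value_after, value_before):
    """log2 of the ratio after/before; both values must be positive."""
    if value_after <= 0 or value_before <= 0:
        raise ValueError("fold change is undefined for non-positive values")
    return math.log2(value_after / value_before)


# Hypothetical counts for one 2-kb transcript in a 20-million-read library.
before_egf = rpkm(read_count=200, exon_length_bp=2000, total_mapped_reads=20_000_000)
after_egf = rpkm(read_count=800, exon_length_bp=2000, total_mapped_reads=20_000_000)
print(before_egf, after_egf, log2_fold_change(after_egf, before_egf))
```

Transcripts with zero reads in one condition need separate handling (for example a small pseudocount), which this sketch deliberately leaves to the caller by raising an error.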

The increased RNAPII occupancy at eRNA loci suggests a block 
in 3’-end cleavage of primary eRNA transcripts, leading to a defect 
in termination. To quantitate such a 3’-end cleavage defect, we 
measured the accumulation of primary levels (or unprocessed levels) 
of eRNA transcripts after Integrator depletion using semi-quantitative 
PCR and real-time PCR. We observed a 3- to 10-fold accumulation of 
unprocessed eRNA transcripts concomitant with the reduction of 
the processed eRNA levels (Fig. 3c-e and Extended Data Fig. 8a). 
Previous experiments revealed that the loss of 3’-end cleavage by 
Integrator led to increased levels of polyadenylated U snRNA tran- 
scripts, which are normally not polyadenylated’’. Indeed, analysis of 
the polyadenylated transcripts revealed a robust increase in polyade- 
nylation of eRNAs in the absence of Integrator (Fig. 3f, g). These 
results attest to Integrator cleavage of the 3’ end of eRNAs leading to 
a termination of transcription. 


We surmised that such a termination defect might result in the
inability of RNAPII to dissociate from the eRNAs, leading to accumulation of
RNAPII-eRNA complexes and a consequent decrease in mature eRNA
levels. We performed ultraviolet (UV) cross-linking followed by RNA
immunoprecipitation (UV-RIP) using antibodies against RNAPII to
examine increased association of eRNAs with RNAPII after depletion
of Integrator. Consistent with a role for Integrator in the processing of
eRNAs, depletion of Integrator led to a profound increase in eRNA
engagement with RNAPII following induction with EGF (Extended Data
Fig. 8b-d). We found similar results after analysis of RNAPII interaction
with the eRNAs at the ATF3 super-enhancer (Extended Data Fig. 8e-g).
Taken together, these results implicate the Integrator complex in the
termination of eRNAs and highlight Integrator’s role in the release of
eRNA transcripts from transcribing RNAPII.

[Figure 2, panels a-c: diagrams of the NR4A1 and DUSP1 loci with primer
positions and digestion sites (a); interaction-frequency plots across sites
N1-N6 and D1-D5 for control cells with and without EGF (b) and for shINTS11
cells with and without EGF (c).]

Figure 2 | Integrator is required for enhancer-promoter interaction.
a, Diagrams of NR4A1 (left) and DUSP1 (right) genomic regions with
their respective enhancers (shown in red). The
arrowheads depict the position of primers for
detection of chromatin looping and the stick bars
indicate enzyme digestion sites (named N1-6 and
D1-5). E refers to the anchor primer at the
enhancer sites; control sites are also indicated.
b, Looping events between the promoter region
of NR4A1 and its enhancer were detected at N3,
N4 and N5 sites after EGF induction (left). A
similar interaction was also captured between
sites D3 and D4 of DUSP1 promoter and its
downstream enhancer after EGF induction
(right). c, Knockdown of Integrator abolished
chromosomal looping events at both NR4A1 and
DUSP1 sites. The interaction frequency between
the anchoring points and the distal fragments were
determined by real-time PCR and normalized to
BAC templates. All sites were assayed in three
independent experiments (P < 0.01, two-sided
t-test). Control anchors are displayed in Extended
Data Fig. 5.

[Figure 3, panels a-g: RNAPII ChIP-seq, GRO-seq and H3K27ac tracks at the
NR4A1, DUSP1 and DUSP5 enhancers (a, b); primer design and semi-quantitative
PCR of unprocessed and total NR4A1, CCNL1 and DUSP1 eRNAs in two replicates
(c, d), with GUSB mRNA as loading control (e); polyadenylated RNA tracks (f)
and box plot of polyadenylated reads (g).]

Figure 3 | Integrator has a role in termination of
eRNAs. a, b, RNAPII dynamics was analysed by
ChIP-seq and GRO-seq at the enhancer regions
adjacent to NR4A1 and DUSP1 (a) and at the
super-enhancer upstream of DUSP5 (b). The y axis
represents the read counts normalized to
sequencing depth. c, 3’-end cleavage of eRNAs was
examined with semi-quantitative PCR. Primer
pairs were designed to amplify a portion of the
enhancer transcript as detected in the control
GRO-seq experiment (t, total) or a longer template
further extending into the 3’ of the enhancer
region (u, unprocessed). d, PCR analysis was
performed in two independent replicates, before
(ctrl) and after (dox) depletion of INTS11 at three
eRNAs (sense and antisense strand). e, The
housekeeping gene GUSB was used as a cDNA
loading control. f, Polyadenylation of RNAs
increases after depletion of Integrator at DUSP1
and CCNL1 enhancer loci. The polyadenylated
fraction of RNA from whole-cell lysates was
sequenced after EGF stimulation, before and after
depletion of INTS11 (dox). g, Box plot shows
significant increase in polyadenylated RNA reads
(P < 0.001) across the entire set of EGF responsive
enhancers. Whiskers on the box plot indicate
the variability in the datasets.

The catalytic subunit of Integrator is composed of the heterodimer 
of INTS11 and INTS9 enzymes with close homology to CPSF73 and 
CPSF100, respectively”. We previously showed that a single point 
mutation (E203Q) in the catalytic domain of INTS11 leads to impaired 
processing of small nuclear RNAs’. To assess the impact of INTS11 
enzymatic activity on eRNA biogenesis, we developed wild type and 
mutant INTS11 (E203Q) that would be refractory to the action of
shRNAs against INTS11, and used these constructs to perform rescue
experiments. While ectopic expression of wild-type INTS11 could
substantially rescue the EGF-induced eRNA levels after depletion of
INTS11, the single-point catalytic mutant was without any effect
(Fig. 4a and Extended Data Fig. 9a). Interestingly, we observed a sim- 
ilar rescue of the transcriptional activation of EGF-induced genes by 
the wild-type INTS11 and not its catalytic mutant (Fig. 4b). These 
results not only demonstrate the requirement of INTS11 catalytic 
activity in regulating the induction of eRNAs but also highlight the 
defect in eRNA processing as a contributing factor in the loss of tran- 
scriptional responsiveness. 

To determine the scope of Integrator function on active enhancers
we analysed the 2,029 transcriptionally active enhancers in HeLa cells.
We ranked the enhancers based on their transcriptional activity, which
mirrored that of RNAPII occupancy (Fig. 4c). Notably, depletion of
Integrator resulted in processing defects at all active enhancers, as
reflected by the broadening of GRO-seq and RNAPII ChIP-seq reads
commensurate with the transcriptional activity of each enhancer site
(Fig. 4c). This was in contrast to GRO-seq and RNAPII profiles at
transcriptionally active protein-coding genes (Extended Data Fig. 9b).
These results demonstrate the generality of Integrator in the
processing of eRNAs at enhancers (Fig. 4d).

[Figure 4, panels a-d: real-time PCR time courses (0 and 20 min) for CCNL1
and DUSP1 eRNAs (sense and antisense strands) (a) and the corresponding
mRNAs (b) in WT, Dox, Dox + INTS11 and Dox + INTS11(E203Q) cells; heat maps
over 2,029 regions (10,000 bp per 400 bins, centred on the RNAPII peak)
including RNAPII ChIP and H3K27ac (c); model comparing serum depletion, EGF
induction and EGF induction + Integrator RNAi (d).]

Figure 4 | Integrator has a global role in
enhancer regulation. a, Ectopic expression of
wild-type INTS11, and not its catalytic mutant
(E203Q), following Integrator depletion can rescue
eRNA induction by EGF. b, A similar rescue was
observed for wild-type INTS11 on the target
protein-coding genes. Real-time PCR analysis was
performed on CCNL1 and DUSP1 eRNAs and their
corresponding mRNAs before and after EGF
stimulation. Each eRNA was assayed with two sets
of primers. Error bars represent ± s.e.m. (n = 3
biological independent experiments), **P < 0.01
by two-sided t-test. c, The heat map showcases
2,029 enhancer regions identified using RNAPII
extragenic loci enriched in H3K27 acetylation (see
Methods). Enhancers were centred at the middle of
the RNAPII peak and ranked by transcription
activity (GRO-seq). The distribution of p300 and
H3K27ac are consistent with a group of active
enhancers. Upon Integrator depletion, nascent
RNA reads and RNAPII profiles spread beyond the
normal 3’ end of eRNAs. d, Model for the role of
Integrator at eRNAs. Stimulation of serum-starved
cells with EGF triggers recruitment of RNAPII
and Integrator to enhancer sites and induces
bi-directional transcription of non-polyadenylated
eRNAs. Upon EGF stimulation Integrator
navigates the enhancers along with RNAPII to
promote endonucleolytic cleavage of nascent
transcripts, leading to release of the mature eRNAs.
Depletion of Integrator elicits a cleavage defect
leading to faulty termination, which results in
extended eRNA transcripts and accumulation of
RNAPII.
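The heat map of Fig. 4c is annotated as spanning 10,000 bp in 400 bins, with enhancers centred on the RNAPII peak and ranked by transcriptional activity. As a rough illustration of that kind of layout (not the authors' pipeline), the Python sketch below averages a per-base signal into equal-width bins and orders regions by an activity score; the function names, the dictionary key `activity` and the toy signal are all assumptions made for this example.

```python
def bin_signal(per_base_signal, n_bins=400):
    """Average a per-base signal into n_bins equal-width bins."""
    window = len(per_base_signal)
    if window == 0 or window % n_bins != 0:
        raise ValueError("window length must be a positive multiple of n_bins")
    width = window // n_bins  # 10,000 bp / 400 bins = 25 bp per bin
    return [sum(per_base_signal[i:i + width]) / width
            for i in range(0, window, width)]


def rank_by_activity(regions):
    """Sort regions (dicts with an 'activity' score) from most to least active."""
    return sorted(regions, key=lambda region: region["activity"], reverse=True)


# Toy 10,000-bp window: low signal upstream of the centre, higher downstream.
toy_signal = [1.0] * 5000 + [3.0] * 5000
row = bin_signal(toy_signal)
print(len(row), row[0], row[-1])

regions = [{"name": "enhancer_A", "activity": 2.5},
           {"name": "enhancer_B", "activity": 7.0}]
print([region["name"] for region in rank_by_activity(regions)])
```

Each binned row would correspond to one enhancer in the heat map, so spreading of reads beyond the normal 3’ end appears as signal extending into bins further from the centre.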

Recent genome-wide studies have revealed the presence of RNAPII 
at active enhancers coincident with expression of these regulatory 
elements as long non-coding RNAs*”’. Importantly, such eRNAs have 
been shown to have critical roles in transcriptional induction by a 
variety of signal transduction pathways’*'''*. We show that 
Integrator is the molecular machine that is recruited to enhancers in 
a signal-dependent manner and is required for the induction of 
eRNAs. We surmise that the defect in 3’-end processing following 
Integrator depletion leads to a termination defect reflected in increased 
levels of primary eRNA transcripts. It is also likely that Integrator 
affects the stability of the mature transcripts, since its depletion leads 
to changes in steady-state levels of mature eRNAs. 

Similar to other regulatory complexes, Integrator is also recruited to 
the promoters of protein-coding genes including IEGs'****. Interes- 
tingly, recent reports described an association between Integrator and 
transcriptional pause release factors, negative elongation factor 
(NELF) and SPT4-SPT5 complexes’*'*****, NELF was also reported 
to associate with eRNAs in neuronal cells”°®. Indeed, we found that 
Integrator depletion resulted in a defect in transcriptional initiation 
as well as pause release, which was reflected in the loss of responsive- 
ness of IEGs to EGF stimulation’*. However, depletion of NELF sub- 
units did not affect eRNA induction (Extended Data Fig. 7c, d). 
Moreover, Integrator depletion did not change NELF occupancy at 
EGF-induced enhancers (Extended Data Fig. 7e). Taken together, our 
results point to multiple functions for Integrator at protein-coding 
genes. While Integrator at promoters regulates pause release factors, 
leading to modulation of productive transcriptional elongation, 
Integrator at enhancers governs eRNA maturation and enhancer- 
promoter communication. 


Online Content Methods, along with any additional Extended Data display items 
and Source Data, are available in the online version of the paper; references unique 
to these sections appear only in the online paper. 


Received 14 March 2014; accepted 7 July 2015. 
Published online 26 August 2015. 


1. Baillat, D. et al. Integrator, a multiprotein mediator of small nuclear RNA
processing, associates with the C-terminal repeat of RNA polymerase II. Cell 123,
265-276 (2005).

2. Wang, K. C. et al. A long noncoding RNA maintains active chromatin to coordinate
homeotic gene expression. Nature 472, 120-124 (2011).

3. Ørom, U. A. et al. Long noncoding RNAs with enhancer-like function in human cells.
Cell 143, 46-58 (2010).

4. De Santa, F. et al. A large fraction of extragenic RNA pol II transcription sites overlap
enhancers. PLoS Biol. 8, e1000384 (2010).

5. Kim, T. K. et al. Widespread transcription at neuronal activity-regulated enhancers.
Nature 465, 182-187 (2010).

6. Bhatt, D. M. et al. Transcript dynamics of proinflammatory genes revealed by
sequence analysis of subcellular RNA fractions. Cell 150, 279-290 (2012).

7. Lam, M. T. et al. Rev-Erbs repress macrophage gene expression by inhibiting
enhancer-directed transcription. Nature 498, 511-515 (2013).

8. Li, W. et al. Functional roles of enhancer RNAs for oestrogen-dependent
transcriptional activation. Nature 498, 516-520 (2013).

9. Wang, D. et al. Reprogramming transcription by distinct classes of enhancers
functionally defined by eRNA. Nature 474, 390-394 (2011).

10. Sigova, A. A. et al. Divergent transcription of long noncoding RNA/mRNA gene
pairs in embryonic stem cells. Proc. Natl Acad. Sci. USA 110, 2876-2881 (2013).

11. Hah, N. et al. A rapid, extensive, and transient transcriptional response to estrogen
signaling in breast cancer cells. Cell 145, 622-634 (2011).

12. Whyte, W. A. et al. Master transcription factors and mediator establish super-
enhancers at key cell identity genes. Cell 153, 307-319 (2013).

13. Lovén, J. et al. Selective inhibition of tumor oncogenes by disruption of super-
enhancers. Cell 153, 320-334 (2013).

14. Hnisz, D. et al. Super-enhancers in the control of cell identity and disease. Cell 155,
934-947 (2013).

15. Sanyal, A., Lajoie, B. R., Jain, G. & Dekker, J. The long-range interaction landscape of
gene promoters. Nature 489, 109-113 (2012).

16. Melo, C. A. et al. eRNAs are required for p53-dependent enhancer activity and gene
transcription. Mol. Cell 49, 524-535 (2013).

17. Mousavi, K. et al. eRNAs promote transcription by establishing chromatin
accessibility at defined genomic loci. Mol. Cell 51, 606-617 (2013).

18. Gardini, A. et al. Integrator regulates transcriptional initiation and pause release
following activation. Mol. Cell 56, 128-139 (2014).

19. Yamamoto, J. et al. DSIF and NELF interact with Integrator to specify the correct
post-transcriptional fate of snRNA genes. Nat. Commun. 5, 4263 (2014).

20. Albrecht, T. R. & Wagner, E. J. snRNA 3’ end formation requires heterodimeric
association of integrator subunits. Mol. Cell. Biol. 32, 1112-1123 (2012).

21. Koch, F. et al. Transcription initiation platforms and GTF recruitment at tissue-
specific enhancers and promoters. Nature Struct. Mol. Biol. 18, 956-963 (2011).

22. Yang, L. et al. lncRNA-dependent mechanisms of androgen-receptor-regulated
gene activation programs. Nature 500, 598-602 (2013).

23. Stadelmayer, B. et al. Integrator complex regulates NELF-mediated RNA
polymerase II pause/release and processivity at coding genes. Nat. Commun. 5,
5531 (2014).

24. Skaar, J. R. et al. The Integrator complex controls the termination of transcription at
diverse classes of gene targets. Cell Res. 25, 288-305 (2015).

25. Schaukowitch, K. et al. Enhancer RNA facilitates NELF release from immediate
early genes. Mol. Cell 56, 29-42 (2014).


Supplementary Information is available in the online version of the paper. 


Acknowledgements We would like to thank J. M. Marinis and M. A. Lazar for technical
support for GRO-seq experiments. We thank D. Hu in A. Shilatifard’s laboratory for
performing the SEC ChIP-seq experiments. We thank the Oncogenomics core facility at
Sylvester Comprehensive Cancer Center for performing high-throughput sequencing.
We also thank Shiekhattar laboratory members and P.-J. Hamard for support and
discussions. This work was supported by funds from University of Miami Miller School
of Medicine, Sylvester Comprehensive Cancer Center and grants R01 GM078455 and
R01 GM105754 (R.S.) from the National Institutes of Health.


Author Contributions F.L. and A.G. are co-first authors. R.S., F.L. and A.G. conceived and
designed the overall project. F.L., A.G. and A.Z. performed the experiments. R.S., F.L. and
A.G. analysed the data and wrote the paper.


Author Information High-throughput data are deposited at the Gene Expression 
Omnibus (GEO) under accession number GSE68401. Reprints and permissions 
information is available at www.nature.com/reprints. The authors declare no 
competing financial interests. Readers are welcome to comment on the online version 
of the paper. Correspondence and requests for materials should be addressed to R.S. 
(rshiekhattar@med.miami.edu). 


17 SEPTEMBER 2015 | VOL 525 | NATURE | 403 


©2015 Macmillan Publishers Limited. All rights reserved 


LETTER 


METHODS 


No statistical methods were used to predetermine sample size. 

Genome-wide data. High-throughput sequencing data analysed in this study are 
originally described in ref. 18 and are deposited at the Gene Expression Omnibus 
with accession number GSE40632. 

H3K27ac, H3K4me1 and p300 data sets from HeLa-S3 cells are available as part
of the ENCODE project (ref. 26) and can be retrieved under the following accession
numbers: GSM733684, GSM798322, GSM93550. Additional experiments are
deposited at GEO (GSE68401) and include RNA-seq data (chromatin-bound
RNA, polyadenylated and non-polyadenylated fractions of total RNA) as well as
ChIP-seq experiments (acetylation of H3K27 and occupancy of NELFA). Every
genome-wide experiment was performed in two independent biological replicates.
Genome-wide identification of eRNA loci. Peak analysis of RNAPII ChIP-seq 
data after EGF stimulation was performed using HOMER 4.6 (run in ‘factor’ 
mode). Next, we used the BEDtools suite to discard any peak overlapping with:
(i) all exons from Hg19 UCSC Known Genes (with additional 2 kb surrounding
every exon); (ii) RNA Genes (from the Hg18 genome annotation table, plus
additional 1 kb); (iii) tRNA Genes (Hg19, plus additional 1 kb). We further
selected peaks overlapping (±400 bp) with H3K27ac peaks from the ENCODE
ChIP-seq obtained in HeLa-S3 (GEO GSE31477). The analysis resulted in 2,029
regions that were further examined for their transcriptional response to EGF.
Briefly, we centred a 6-kb window on the middle of the RNAPII peak and we used
HOMER 4.6 to calculate RPKM across the entire eRNA locus using chromRNA- 
seq data before and after EGF induction. We selected a group of 225 EGF-indu- 
cible eRNAs displaying a fold change greater than 2 (ctrl versus EGF) and iden- 
tified the nearest EGF regulated gene (fold change RPKM >1.6). 91 EGF-induced 
enhancer RNAs located within 500 kb from the nearest EGF-responsive protein- 
coding genes were selected for further analysis. 
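The selection thresholds described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the HOMER/BEDtools pipeline the authors ran: the `Locus` and `Gene` records, the function names and the small pseudocount that guards against division by zero are all hypothetical, and RPKM values are taken as already computed.

```python
from dataclasses import dataclass


@dataclass
class Locus:
    """Candidate eRNA locus (hypothetical record, not a HOMER output format)."""
    name: str
    chrom: str
    center: int        # midpoint of the RNAPII peak, in bp
    rpkm_ctrl: float   # chromRNA-seq RPKM before EGF
    rpkm_egf: float    # chromRNA-seq RPKM after EGF


@dataclass
class Gene:
    """Protein-coding gene with expression before and after EGF."""
    name: str
    chrom: str
    tss: int
    rpkm_ctrl: float
    rpkm_egf: float


def fold_change(ctrl, egf, pseudocount=0.01):
    # The pseudocount is an assumption added here to avoid dividing by zero.
    return (egf + pseudocount) / (ctrl + pseudocount)


def select_egf_induced(loci, genes, erna_fc=2.0, gene_fc=1.6, max_dist=500_000):
    """Keep loci induced more than erna_fc whose nearest EGF-regulated gene
    (fold change above gene_fc) lies within max_dist bp on the same chromosome."""
    responsive = [g for g in genes
                  if fold_change(g.rpkm_ctrl, g.rpkm_egf) > gene_fc]
    selected = []
    for locus in loci:
        if fold_change(locus.rpkm_ctrl, locus.rpkm_egf) <= erna_fc:
            continue
        candidates = [g for g in responsive if g.chrom == locus.chrom]
        if not candidates:
            continue
        nearest = min(candidates, key=lambda g: abs(g.tss - locus.center))
        if abs(nearest.tss - locus.center) <= max_dist:
            selected.append((locus.name, nearest.name))
    return selected
```

Distance is measured here from the peak midpoint to the gene TSS, which is one plausible reading of "located within 500 kb"; other conventions (gene body edges, for example) would change borderline cases.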

ChIP-seq data analysis. ChIP-seq data were obtained using HiSeq 2000 and 
NextSeq 500. Reads were aligned to the human genome hg19 using bowtie2 (ref. 27;
end-to-end alignment, sensitive option). Snapshots of raw ChIP-seq data presented
throughout the figures were obtained as follows: BigWig files for every
ChIP-seq were generated using samtools, bedtools and RSeQC (ref. 28); these tracks were
then uploaded to the UCSC Genome Browser hg19.

Clustering, heat maps and average density analysis. ChIP-seq, GRO-seq and 
RNA-seq data were subjected to read density analysis; seqMINER 1.3.3 (ref. 29) was used
to extract read densities at all enhancer loci with the following parameters: 5’ 
extension = 4 kb, 3’ extension = 4kb, no read extension, total bin number = 180 
bins. Mean density profiles were then generated in R 3.0.1 and normalized to 
sequencing depth. Heat maps were generated with ChAsE (http://chase.cs.univie.ac.at/),
using default parameters, a 10 kb window and 400 bins, and with
ngsplot (ref. 30).
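The read-density step can be illustrated with a short Python sketch. It is not seqMINER or the authors' R code: the function names are hypothetical, reads are reduced to single positions, and "normalized to sequencing depth" is interpreted here as scaling to reads per million mapped reads, which is an assumption.

```python
def binned_density(read_positions, center, total_reads, flank=4000, n_bins=180):
    """Count reads in n_bins equal bins spanning center +/- flank and scale
    the counts to reads per million (assumed depth normalization)."""
    start = center - flank
    width = 2 * flank
    counts = [0] * n_bins
    for pos in read_positions:
        offset = pos - start
        if 0 <= offset < width:
            # Integer arithmetic maps an offset to its bin index.
            counts[offset * n_bins // width] += 1
    scale = 1_000_000 / total_reads
    return [c * scale for c in counts]


def mean_profile(profiles):
    """Average bin by bin across loci to obtain a mean density profile."""
    n = len(profiles)
    return [sum(column) / n for column in zip(*profiles)]
```

With the parameters quoted in the text (4 kb on each side, 180 bins), each bin covers roughly 44 bp; averaging the per-locus vectors gives the kind of mean profile shown in the metagene plots.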

qChIP. ChIP was performed in HeLa cells as already described (ref. 18). Cells were
cross-linked with 1% formaldehyde for 10 min at room temperature, harvested and
washed twice with 1× PBS. The pellet was resuspended in ChIP lysis buffer
(150 mM NaCl, 1% Triton-X 100, 0.7% SDS, 500 µM DTT, 10 mM Tris-HCl,
5 mM EDTA) and chromatin was sheared to an average length of 200-400 bp,
using a Bioruptor sonication device (20 min with 30s intervals). The chromatin 
lysate was diluted with SDS-free ChIP lysis buffer and aliquoted into single immu- 
noprecipitations of 2.5 X 10° cells each. A specific antibody or a total rabbit IgG 
control was added to the lysate along with Protein A magnetic beads (Invitrogen) 
and incubated at 4 °C overnight. On day 2, beads were washed twice with each of 
the following buffers: Mixed Micelle Buffer (150 mM NaCl, 1% Triton-X 100, 0.2% 
SDS, 20 mM Tris-HCl, 5 mM EDTA, 65% sucrose), Buffer 500 (500 mM NaCl, 1% 
Triton-X 100, 0.1% Na deoxycholate, 25 mM HEPES, 10mM Tris-HCl, 1 mM 
EDTA), LiCl/detergent wash (250 mM LiCl, 0.5% Na deoxycholate, 0.5% NP- 
40, 10mM Tris-HCl, 1mM EDTA) and a final wash was performed with 1X 
TE. Finally, beads were resuspended in 1X TE containing 1% SDS and incubated 
at 65°C for 10 min to elute immunocomplexes. Elution was repeated twice, and 
the samples were further incubated overnight at 65°C to reverse cross-linking, 
along with the untreated input (2.5% of the starting material). After treatment with 
0.5 mg ml⁻¹ proteinase K for 3 h, DNA was purified with Wizard SV gel and PCR
Clean-up system (Promega). ChIP eluates and input were assayed by real-time
quantitative PCR in a 20 µl reaction with the following: 0.4 µM of each primer,
10 µl of iQ SYBR Green Supermix (BioRAD), and 5 µl of template DNA
(corresponding to 1/40 of the elution material) using a CFX96 real-time system
(BioRAD). Thermal cycling parameters were: 3 min at 95 °C, followed by 40 cycles 
of 10s at 95 °C, 20s at 63 °C followed by 30s at 72 °C. 
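The protocol does not state how Ct values were converted into enrichment. A common convention, shown here purely as an assumption, is to express the immunoprecipitated signal as a percentage of input after correcting for the fraction of starting material kept as input (2.5% in this protocol). The function name is hypothetical and the formula assumes perfect amplification efficiency (a doubling per cycle).

```python
import math


def percent_input(ct_ip, ct_input, input_fraction=0.025):
    """Percent-of-input enrichment from real-time PCR Ct values.

    The input Ct is first adjusted as if 100% of the starting material had
    been assayed, then compared with the IP Ct assuming 2-fold amplification
    per cycle (an idealization, not a measured efficiency).
    """
    adjusted_input_ct = ct_input - math.log2(1.0 / input_fraction)
    return 100.0 * 2.0 ** (adjusted_input_ct - ct_ip)
```

Under these assumptions an IP sample with the same Ct as the 2.5% input corresponds to 2.5% of input, and every additional cycle halves that value.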

Subcellular fractionation. Subcellular fractionation was performed as described (ref. 6),
with minor changes. The cell lysate was re-suspended in cold lysis buffer with
0.15% NP-40, and the sucrose buffer was used to isolate nuclei. Glycerol buffer
(20 mM Tris pH 7.9, 75 mM NaCl, 0.5 mM EDTA, 50% glycerol, 0.85 mM DTT)
and nuclei lysis buffer (20 mM HEPES pH 7.6, 7.5 mM MgCl2, 0.2 mM EDTA,
0.3 M NaCl, 1 M urea, 1% NP-40, 1 mM DTT) were used to isolate the nucleoplasmic
fraction and the chromatin-bound RNA fraction. Chromatin-bound RNA was isolated
with the Trizol protocol.

RNA isolation for high-throughput sequencing. Total RNA or chromatin- 
bound RNA was extracted using Trizol reagent (Life Technologies). Genomic 
DNA and ribosomal RNA were removed with Turbo DNA-free kit and
RiboMinus Eukaryote Kit (Life Technologies). The polyA and non-polyA frac- 
tions were isolated by running RNA samples three times through the Oligo(dT) 
Dynabeads (Life technologies) to ensure complete separation. The resulting RNA 
fractions were subjected to strand-specific library preparation using NEBNext 
Ultra Directional RNA Library Prep Kit for Illumina (New England Biolabs). 
Sequencing was performed on Nextseq500 (Illumina). 

ChIP-seq. ChIP-sequencing was performed as previously described®. 1 X 10’ cells 
were crosslinked in 1% formaldehyde for 10 min and sonicated with a Bioruptor to 
obtain chromatin fragments of 200-300 bp. Immunoprecipitation was performed 
overnight with the specific antibodies and Dynabeads Protein A or Protein G 
beads (Life Technologies). Beads were washed and chromatin fragments were 
eluted in TE with 1% SDS at 65°C. After de-crosslinking overnight, DNA was 
extracted using Wizard SV extraction columns (Promega) and Illumina sequencing
libraries were prepared using the NEBNext ChIP-seq library prep reagent set (New
England Biolabs), following the manufacturer's instructions. Libraries were
assayed on a BioAnalyzer (High Sensitivity DNA kit) and sequenced on a 
Nextseq500 (Illumina). 

Antibodies. Chromatin immunoprecipitation was performed with polyclonal 
antibodies against INTS11, INTS9, INTS1 (Bethyl, A301-274A, A300-412A, 
A300-361A). ChIP-seq of NELFA and H3K27ac were performed with goat poly- 
clonal antibodies (Santa Cruz, sc-23599) and rabbit polyclonal antibodies (Abcam, 
ab4729), respectively. 

Antibodies used for immunoblot analysis were: γ-tubulin (Santa Cruz, mouse
monoclonal, sc-17788), CBP80 (Santa Cruz, mouse monoclonal, sc-271304),
INTS1 (Bethyl, rabbit polyclonal, A300-361A) and a proprietary rabbit polyclonal
raised against the C terminus of INTS11. Flag M2-conjugated beads (Sigma, 
A2220) were used for immunoprecipitation. 

Chromosome conformation capture (3C). 3C assay was performed as previously
described with minor changes (ref. 31). HeLa cells were filtered through a 70 µm
strainer to obtain a single-cell preparation. 1 × 10⁷ cells were then fixed in
1% formaldehyde for 30 min at room temperature for cross-linking. The reaction 
was quenched with 0.25 M glycine and cells were collected by centrifugation at 
240g for 8 min at 4 °C. Cell pellet was lysed in 5 ml cold lysis buffer (10 mM Tris- 
HCl, pH 7.5; 10 mM NaCl; 5 mM MgCl2; 0.1 mM EGTA) with freshly added
protease inhibitors (Roche) on ice for 15 min. Isolated nuclei were collected by 
centrifugation at 400g for 5min at 4°C then re-suspended in 0.5 ml of 1.2x 
restriction enzyme buffer (NEB) with 0.3% SDS and incubated for 1h at 37°C 
while shaking at 900r.p.m. Next, samples were incubated for 1h at 37°C after 
addition of 2% (final concentration) Triton X-100. 400 U of restriction enzyme 
was added to the nuclei and incubated at 37 °C overnight. 10 µl of samples were
collected before and after the enzyme reaction to evaluate digestion efficiency. 
The reaction was stopped by addition of 1.6% SDS (final concentration) and 
incubation at 65 °C for 30 min while shaking at 900 r.p.m. The sample was then 
diluted 10-fold with 1.15× ligation buffer (NEB) and 1% Triton X-100 and
incubated for 1h at 37°C while shaking at 900r.p.m. 400 U of T4 DNA ligase 
(NEB) were added to the sample and the reaction was carried at 16°C for 4h 
followed by 30 min at room temperature. For each sample, 300 µg of Proteinase K
were added for protein digestion and de-crosslinking at 65 °C overnight. On the
next day, RNA was removed by adding 300 µg of RNase and incubating the
sample for 1 h at 37 °C. DNA was purified twice by phenol-chloroform extraction 
and ethanol precipitation. Purified DNA was then analysed by conventional or 
quantitative PCR. As control for ligation products, the Bac-clones were digested 
with 10U of restriction enzyme overnight and then incubated with 10U T4 
DNA-ligase at 16 °C overnight. The DNA was extracted by phenol-chloroform 
and precipitated with ethanol. Purified DNA was then analysed by conventional 
or quantitative PCR. For real-time PCR, the ΔCt method was applied for analysing
data, using the Bac-clone Ct values as control. Primer sequences for PCR
are listed in Supplementary Table 2. Bac clone ID: RP-11-294A10, RP-11- 
1107P14, RP-11-1068G13 (Empire Genomics). 
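The text names the ΔCt method with BAC-clone Ct values as control but gives no formula. One conventional reading, offered here as an assumption rather than the authors' exact calculation, is that each primer pair's Ct in the 3C library is compared with its Ct on the digested and re-ligated BAC template, and the difference is converted to a relative frequency assuming a doubling per cycle. The function name is hypothetical.

```python
def relative_interaction(ct_sample, ct_bac):
    """Relative interaction frequency for one 3C primer pair.

    delta_ct compares the 3C library with the BAC control template, which
    corrects for primer-pair efficiency; 2 ** -delta_ct assumes perfect
    doubling per PCR cycle.
    """
    delta_ct = ct_sample - ct_bac
    return 2.0 ** (-delta_ct)
```

A primer pair that amplifies two cycles later in the 3C library than on the BAC control would score 0.25 under these assumptions; comparing such scores between untreated and EGF-treated cells is what reveals a change in looping.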

Pol II RNA immunoprecipitation. RIP was performed as described*’. HeLa cells 
were UV-crosslinked at 254 nm (200 mJ cm⁻²) in 10 ml ice-cold PBS and collected
by scraping. Cells were incubated in lysis solution (0.1% SDS, 0.5% NP40, 0.5%
sodium deoxycholate, 400 U ml⁻¹ RNase Inhibitor (Roche)) and protease inhibitor
at 4 °C for 25 min with rotation, followed by DNase treatment (30 U of DNase,
15 min at 37 °C). Protein A Dynabeads (Invitrogen) were incubated with 2 µg Pol
II antibody (Santa Cruz, N-20) and the cell lysate at 4 °C overnight. The purified
protein-RNA complex was extracted using the TRIzol method for RNA extraction
and subjected to RT-qPCR with corresponding primers.

Inducible cell lines. INTS11 and INTS1 knockdown inducible clones were gen- 
erated from HeLa cells using the Tet-pLKO-puro vector. For EGF induction, cells 
were serum starved in 0.5% FBS for 48 h and treated with 100 ng ml⁻¹ EGF
(Invitrogen) for the indicated time course. All cell lines in this study are myco- 
plasma negative. 

Transfections. Cells were treated with doxycycline for 48 h. 24h before EGF induc- 
tion, INTS11 and INTS11 (E203Q) mutant protein expression plasmids were trans- 
fected using Lipofectamine 2000 (Life Technologies, Inc.) according to the 
manufacturer’s instruction. Cells were harvested 0 and 20 min after EGF induction. 

All the PCR primer sequences are listed in the supplementary Table 2. 


26. Gerstein, M. B. et al. Architecture of the human regulatory network derived from
ENCODE data. Nature 489, 91-100 (2012).

27. Langmead, B., Trapnell, C., Pop, M. & Salzberg, S. L. Ultrafast and memory-efficient
alignment of short DNA sequences to the human genome. Genome Biol. 10, R25
(2009).

28. Wang, L. G., Wang, S. Q. & Li, W. RSeQC: quality control of RNA-seq experiments.
Bioinformatics 28, 2184-2185 (2012).

29. Ye, T. et al. seqMINER: an integrated ChIP-seq data interpretation platform. Nucleic
Acids Res. 39, e35 (2011).

30. Shen, L., Shao, N., Liu, X. & Nestler, E. ngs.plot: Quick mining and visualization of
next-generation sequencing data by integrating genomic databases. BMC
Genomics 15, 284 (2014).

31. Lai, F. et al. Activating RNAs associate with Mediator to enhance chromatin
architecture and transcription. Nature 494, 497-501 (2013).


[Extended Data Figure 1 image, panels a-i: enhancer marks (p300, H3K4me1, H3K27ac), H3K27ac, ChromRNA-seq and RNA-seq profiles plotted against distance from the RNAPII peak (±4 kb) for CTRL and CTRL EGF; box plots for eRNAs and protein-coding genes; tracks at the NR4A1 locus and NR4A1 enhancer.]


Extended Data Figure 1 | Identification of eRNAs responsive to EGF. a, We
identified 91 EGF-responsive enhancer regions in HeLa cells. We annotated
extragenic RNAPII sites (see Methods) and used the middle of the RNAPII
peak as an anchor to display average profiles of p300, H3K27ac and H3K4me1
(data from the ENCODE project). The profiles represent the mean read density
of ChIP-seq data. The 91 loci display a typical enhancer signature, with
enrichment of p300 and H3K27ac around the TSS and a broader decoration by
H3K4me1. b, Profiles of H3K27ac were obtained from ChIP-seq analysis of
HeLa cells before and after 20 min of EGF induction. Mean read density was
normalized to sequencing depth. c, EGF stimulates bi-directional transcription
from 91 enhancer regions. We displayed the mean read density obtained
from strand-specific sequencing of the chromatin-bound RNA fraction
(ChromRNA-seq). d, e, Normalized read density (RPKM) was calculated from
RNA-seq data for 91 eRNAs (d) and 57 neighbouring protein-coding genes
(e) that responded to EGF stimulation (FC >1.6) and mapped within 500 kb
from an EGF-responsive eRNA. f, Average profiles of ChromRNA-seq data at
91 enhancer loci (mean density of reads, normalized to total read number).
g, h, Box plot of 91 eRNAs before and after treatment with EGF shows the
average increase of transcription 20 min after stimulation (P < 0.001), matched
by an increase in the neighbouring protein-coding genes (P < 0.02). i, NR4A1
is activated by EGF in HeLa cells: RNAPII and INTS11 are recruited to the
NR4A1 locus after 20 min of stimulation, with concomitant accumulation of
reads from RNA-seq and ChromRNA-seq. A neighbouring eRNA locus also
exhibits increased transcription along with RNAPII and INTS11 recruitment.
Sequencing tracks are visualized in BigWig format and aligned to the hg19
assembly of the UCSC Genome Browser. Whiskers on the box plots indicate the
variability in the datasets.


[Extended Data Figure 2 image, panels a-d: qRT-PCR of sense (S) and antisense (As) eRNAs at CCNL1, NR4A1 and DUSP1 with random hexamer or oligo d(T) priming; ChromRNA-seq, polyA and non-polyA RNA-seq tracks at the CCNL1 enhancer, DUSP1 enhancer, DUSP1 mRNA and RNU12; box plots of polyadenylation of eRNAs versus protein-coding genes.]

Extended Data Figure 2 | EGF-induced eRNAs are predominantly non-
polyadenylated. a, We examined transcription at three enhancers adjacent
to EGF-responsive genes CCNL1, NR4A1 and DUSP1. Total RNA samples
were collected before and after EGF induction. Reverse transcription was
performed with random hexamer primer or oligo d(T) primer. Each eRNA
strand was analysed by real-time PCR with specific primers. Error bars
represent ± standard error of the mean (s.e.m., n = 3 biological independent
experiments). P < 0.01 by two-sided t-test. b, c, RNA-seq was performed on
the polyadenylated and non-polyadenylated fraction of total RNA. RNA-seq
tracks were visualized in BigWig format and aligned to the hg19 assembly
of the UCSC Genome Browser. CCNL1 and DUSP1 enhancers were displayed
(b) along with a polyadenylated control (DUSP1 protein-coding locus) and a
non-polyadenylated transcript (snRNA U12) (c). All EGF-induced eRNAs
and protein-coding genes (RefSeq hg19) were examined for their average
RPKM throughout the entire locus. d, We compared polyadenylation levels of
225 eRNAs and 150 protein-coding genes (2 fold induction upon EGF, RPKM
calculated from ChromRNA-seq data previously described). The box plot
shows predominance of non-polyadenylated transcripts mapping to eRNA
loci, as opposed to transcripts coding for RefSeq genes. Whiskers on the box
plot indicate the variability in the datasets.


[Extended Data Figure 3 image, panels a-c: qChIP of INTS1, INTS9, INTS11 and IgG at the NR4A1, DUSP1, DUSP5 and CCNL1 enhancers at 0, 20, 40 and 60 min of EGF; immunoblot of INTS1, CBP80 and INTS11 with and without DOX; box plot of H3K27ac variation (fold change, log2) for CTRL EGF and DOX EGF.]

Extended Data Figure 3 | The Integrator complex is recruited to enhancers
upon EGF stimulation. a, qChIP analysis of Integrator occupancy using
INTS11, INTS1 and INTS9 antibodies at four eRNA loci. Data were collected
during a time course of EGF induction in HeLa cells (0, 20, 40 and 60 min).
Error bars represent ± standard error of the mean (s.e.m., n = 3 biological
independent experiments). P < 0.01 by two-sided t-test. b, Depletion of INTS1
and INTS11 protein levels in tet-inducible HeLa clones. The arrow indicates the
INTS11-specific signal; the asterisk shows a non-specific band. c, Fold change
of H3K27 acetylation (0 min/20 min EGF) before (ctrl) and after (dox)
depletion of INTS11. Data were calculated from read density of ChIP-seq
experiments across EGF-induced enhancers. Depletion of Integrator
significantly affects EGF-dependent increase in H3K27ac (P < 0.05). Whiskers
on the box plot indicate the variability in the datasets.


[Extended Data Figure 4 image, panels a-d: qRT-PCR time courses (0-20 min, fold change relative to GAPDH) of sense and antisense eRNAs at CCNL1, DUSP1 and NR4A1 with INTS11 or INTS1 shRNAs, with and without DOX; ChIP-seq and RNA-seq tracks (EGF, EGF DOX, p300, H3K27ac) at the ATF3 locus and its upstream enhancer region with primer positions e1-e4; qRT-PCR of ATF3 enhancer transcripts with and without shINT11.]

Extended Data Figure 4 | Depletion of Integrator impairs activation of
eRNAs by EGF. a, b, Activation of eRNAs near DUSP1, CCNL1 and NR4A1
genes was assayed by qRT-PCR in three independent experiments, using
INTS11 (a) or INTS1 (b) inducible shRNA clones. Transcription was followed
throughout a 20-min time-course experiment. Each eRNA was amplified with
two different sets of specific primers to analyse both strands; dashed lines
indicate treatment with doxycycline (dox) to induce shRNAs. Data at every
time point are reported as fold change (EGF/non-induced). Error bars
represent ± s.e.m. (n = 3 biological independent experiments), P < 0.01 by
two-sided t-test. c, Schematic representation of ATF3 and its super-enhancer
region located 30 kb upstream (top). Snapshots of ChIP-seq and RNA-seq
tracks show EGF-dependent recruitment of RNAPII and INTS11 at the
ATF3 locus and at several upstream enhancers. Depletion of INTS11 nearly
abolished transcription of eRNAs and ATF3 mRNA. d, Real-time RT-PCR
analysis of the ATF3 super-enhancer region upon depletion of INTS11. qPCR
analysis was performed before and 5, 10, 15, 20 min after EGF treatment with
strand-specific primer sets (indicated below the RNA-seq tracks in c). Error
bars represent ± s.e.m. (n = 3 biological independent experiments), P < 0.01
by two-sided t-test.


[Extended Data Figure 5 image, panels a, b: relative interaction frequencies for CTRL and CTRL + EGF at sites Con1, N1-N6, E and Con2 (NR4A1 locus) and at sites Con, E and D1-D5 (eRNA and DUSP1 locus, 10 kb scale bar).]
Extended Data Figure 5 | Chromatin conformation capture at control loci.
a, 3C analysis of NR4A1 promoter and control sites. The Con1 site lies
74 kb upstream of the NR4A1 protein-coding gene and the Con2 site is located
42 kb downstream of the enhancer site. There are no looping events between
either control site and the NR4A1 promoter region after EGF induction.
b, Similarly, no looping events were detected between the promoter of DUSP1
and a downstream control site (Con). All data were averaged from three
independent experiments, P < 0.01 by two-sided t-test.


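[Extended Data Figure 6 image, panels a-d: GRO-seq and RNAPII mean density profiles around the RNAPII peak (±4 kb) for CTRL EGF and DOX EGF, with box plots of normalized read counts (P < 0.004); cumulative plots of RNAPII travelling ratio (percent of eRNAs) for sense and antisense transcripts.]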

Extended Data Figure 6 | Integrator has a role in eRNA termination. 

a, Mean density profiles of GRO-seq data at 91 EGF-induced enhancers. Data 
are presented as strand-specific mean read density, centred at the middle 

of the RNAPII peak and normalized to sequencing depth. The underlying box 
plots were used to quantify the enrichment of GRO-seq reads at the 3’ end 
of both eRNA transcripts (2 kb window, centred 1 kb downstream of the 
RNAPII peak). b, RNAPII profiling at 91 enhancers after INTS11 depletion 
shows accumulation of ChIP-seq reads towards the 3’ end. Data are presented 
as mean read density, centred at the middle of the RNAPII peak and normalized 


LETTER 


oO 
1.4 


Pa — CTRLEGF 
- RNAPII Bey Ben 
a 
x2) 
Cc 
F wo 
oO 
= 
n 
oO 


Distance from RNAPII peak A4kb 


2 


Norm. Read Counts 
4 
Norm. Read Counts 
5 


100 


— CTRL EGF 
— DOX EGF 


0 RNAPII traveling ratio 1.5 


Percent of eRNAs 
50 


100 


Antisense 


— CTRL EGF 
— DOX EGF 


Percent of eRNAs 
50 


0 RNAPII traveling ratio 1.5 


to sequencing depth. Box plots represent the enrichment of RNAPII reads of 
both eRNA transcripts (2 kb window, centred 1 kb downstream of the RNAPII 
peak). RNAPII significantly accumulated (P < 0.004) after depletion of 
INTS11. Whiskers on the box plots indicate the variability in the datasets. 

c, d, RNAPII travelling ratio at enhancers was measured as the ratio between 
RNAPII density close to the transcription start site (the surrounding 300 bp) 
and 3 kb downstream. Given the bi-directional nature of transcription at 
enhancers, travelling ratio was calculated for both sense (c) and antisense 

(d) transcripts. 
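
The travelling ratio described in the legend for panels c and d can be sketched in a few lines. This is only an illustration under stated assumptions: the legend does not give the exact window boundaries, so the sketch takes the promoter-proximal window as 150 bp on either side of the start site and the distal window as the remainder of the first 3 kb downstream. The function name, the per-base coverage list and the toy numbers are hypothetical and are not taken from the paper.

```python
def travelling_ratio(coverage, tss_index, tss_window=300, downstream=3000):
    """Ratio of mean read density around the start site to mean density downstream.

    coverage   : per-base read density along the transcribed strand (list of floats)
    tss_index  : position of the transcription start site within `coverage`
    tss_window : width of the window surrounding the start site (assumed symmetric)
    downstream : how far past the start site the distal window extends
    """
    half = tss_window // 2
    proximal = coverage[max(tss_index - half, 0):tss_index + half]
    distal = coverage[tss_index + half:tss_index + downstream]
    if not proximal or not distal:
        raise ValueError("coverage does not span the requested windows")

    proximal_density = sum(proximal) / len(proximal)
    distal_density = sum(distal) / len(distal)
    if distal_density == 0:
        # No signal downstream: the ratio is unbounded.
        return float("inf")
    return proximal_density / distal_density


if __name__ == "__main__":
    # Toy profile (invented numbers): a peak at the start site, lower signal downstream.
    toy = [0.0] * 350 + [10.0] * 300 + [2.0] * 2850 + [0.0] * 500
    print(travelling_ratio(toy, tss_index=500))
```

For the antisense transcript the same function could be applied to the reversed coverage of the opposite strand, so that "downstream" always follows the direction of transcription.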




[Extended Data Figure 7 panels. a, b, ChIP binding of ELL2 (a) and AFF4 (b)
across 4 kb on either side of the enhancer centre (CTRL, CTRL EGF, DOX, DOX
EGF). c, shRNA depletion of NELFA and NELFE (shRNA GFP, shRNA NELFA, shRNA
NELFE). d, CCNL1, DUSP1 and DUSP5 eRNA levels after GFP, NELFA or NELFE shRNA
(CTRL versus EGF). e, NELFA ChIP-seq read density against distance from the
RNAPII peak (CTRL EGF versus DOX EGF).]

Extended Data Figure 7 | Analysis of super elongation complex at
enhancers. a, b, Metagene analysis on 91 eRNA loci shows the effect of EGF
stimulation and INTS11 depletion on the recruitment of the ELL2 (a) and AFF4
(b) subunits of the super elongation complex (SEC). SEC was recruited to
enhancers upon EGF stimulation. Depletion of Integrator decreases AFF4 and
ELL2 recruitment. Data were visualized as mean read density, normalized to
sequencing depth, across 8 kb surrounding the centre of enhancers. c, To
investigate the role of the negative elongation factor (NELF) in induction of
eRNAs, we infected HeLa cells with lentiviral shRNAs against NELFA, NELFE
and a control GFP. Quantitative RT-PCR analysis shows the extent of NELF
depletion 72 h after infection. Error bars represent ± s.e.m. (n = 3 biological
independent experiments), P < 0.01 by two-sided t-test. d, Depletion of two
different NELF subunits does not significantly impact activation of
EGF-responsive eRNAs. Data represent fold change of induction (EGF/not induced)
after 20 min of stimulation and were normalized against GUSB expression.
Error bars represent ± s.e.m. (n = 3 biological independent experiments),
P < 0.01 by two-sided t-test. e, ChIP-seq analysis of NELFA before and after
depletion of INTS11. Metagene analysis shows mean read density (normalized
to sequencing depth) across 91 eRNAs. NELF occupancy at enhancers was
not affected by depletion of Integrator.
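
As a rough illustration of the quantity plotted in panel d (fold change of induction, EGF over not induced, normalized against GUSB), the sketch below uses the common 2^-ΔΔCt convention. That convention, the assumption of perfect amplification efficiency, the function names and all Ct values are assumptions made here for illustration; the legend does not state how the normalization was computed.

```python
import math
import statistics


def relative_expression(ct_target, ct_reference):
    """Target abundance relative to the reference gene, assuming doubling per cycle."""
    return 2.0 ** (ct_reference - ct_target)


def fold_induction(ct_target_egf, ct_ref_egf, ct_target_ctrl, ct_ref_ctrl):
    """Reference-normalized expression after stimulation divided by the uninduced level."""
    induced = relative_expression(ct_target_egf, ct_ref_egf)
    uninduced = relative_expression(ct_target_ctrl, ct_ref_ctrl)
    return induced / uninduced


def mean_and_sem(values):
    """Mean and standard error of the mean for a set of replicate measurements."""
    mean = statistics.mean(values)
    sem = statistics.stdev(values) / math.sqrt(len(values))
    return mean, sem


if __name__ == "__main__":
    # Invented Ct values for three replicates: (eRNA +EGF, GUSB +EGF, eRNA ctrl, GUSB ctrl)
    replicates = [(24.0, 20.0, 26.0, 20.0), (24.2, 20.1, 26.1, 20.0), (23.9, 19.9, 26.0, 20.1)]
    folds = [fold_induction(*r) for r in replicates]
    print(mean_and_sem(folds))
```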




[Extended Data Figure 8 panels. a, Fold change of total and unprocessed
transcripts for the CCNL1, NR4A1 and DUSP1 eRNAs. b–d, Fold enrichment of
NR4A1, CCNL1 and DUSP1 eRNAs (sense, S; antisense, As) in CTRL, CTRL+EGF, DOX
and DOX+EGF samples. e–g, Fold enrichment of ATF3 super-enhancer eRNAs
(ATF3.1-S, ATF3.1-As, ATF3.2-S, ATF3.2-As, ATF3.3-1 to ATF3.3-4) in the same
four conditions.]

Extended Data Figure 8 | Integrator depletion causes accumulation of
unprocessed eRNAs and prevents release of RNAPII. a, Termination of
eRNAs was examined with quantitative RT-PCR. Primer pairs were designed
to amplify a portion of the enhancer transcript detected in normal condition
(t, total) or a longer template further extending into the 3′ of the enhancer
region (u, unprocessed). qPCR analysis was performed before (ctrl) and after
(dox) depletion of INTS11 at three eRNAs (sense and antisense strand),
after stimulation with EGF. In the absence of INTS11, we observed
accumulation of unprocessed eRNA, suggestive of a termination defect. Error
bars represent ± s.e.m. (n = 3 biological independent experiments), P < 0.01
by two-sided t-test. Release of eRNA transcripts from RNA polymerase
was investigated by means of RNAPII immunoprecipitation following UV
cross-link (UV-RIP). b–d, After RNAPII immunoprecipitation, eRNAs near
DUSP1, CCNL1 and NR4A1 genes were assayed by qRT-PCR and showed
increased association with RNAPII in the absence of Integrator. Each eRNA
was detected by two different sets of specific primers (sense and antisense).
Error bars represent ± s.e.m. (n = 3 biological independent experiments).
*P < 0.01, **P < 0.01, ***P < 0.001 by two-sided t-test. e–g, RNAPII UV-RIP
analysis was also performed on several eRNAs from the ATF3 super-enhancer.
qRT-PCR on the RNA recovered after immunoprecipitation shows
increased association between RNAPII and eRNAs in the absence of Integrator.
Each eRNA was detected by two different sets of specific primers (sense and
antisense). Error bars represent ± s.e.m. (n = 3 independent
experiments). **P < 0.01 by two-sided t-test.




[Extended Data Figure 9 panels. a, Flag immunoprecipitation blot probed for
INTS11, with γ-tubulin. b, Heat maps of GRO-seq, RNAPII ChIP and H3K27ac (EGF)
signal from −3 kb through the RNAPII peak and TES to +3 kb, with colour key.]

Extended Data Figure 9 | Distribution of RNAPII and nascent RNAs across
protein-coding genes. a, Expression level of exogenous INTS11 wild type
(WT) and its catalytic mutant (E203Q). Nuclear extracts were subjected to
Flag immunoprecipitation and probed with a polyclonal antibody raised
against the C terminus of INTS11. b, Heat map of nascent RNA (GRO-seq) and
RNAPII ChIP-seq across the 2,000 most active genes in HeLa cells. Gene loci
were analysed for their entire gene body, with 3 additional kilobases on both
ends. H3K27ac data from ENCODE is shown on the left; genes are ranked
according to the intensity of RNAPII signal. Depletion of Integrator does not
appear to affect termination at protein-coding genes.




doi:10.1038/nature14880 


Crystal structure of the dynamin tetramer 


Thomas F. Reubold**, Katja Faelber?*, Nuria Plattner*, York Posor‘*+, Katharina Ketel*, Ute Curth’, Jeanette Schlegel’, 
Roopsee Anand!, Dietmar J. Manstein’®, Frank Noé’, Volker Haucke*®, Oliver Daumke”® & Susanne Eschenburg! 


The mechanochemical protein dynamin is the prototype of the
dynamin superfamily of large GTPases, which shape and remodel
membranes in diverse cellular processes¹. Dynamin forms predominantly
tetramers in the cytosol, which oligomerize at the neck of
clathrin-coated vesicles to mediate constriction and subsequent
scission of the membrane¹. Previous studies have described the
architecture of dynamin dimers²,³, but the molecular determinants
for dynamin assembly and its regulation have remained unclear.
Here we present the crystal structure of the human dynamin
tetramer in the nucleotide-free state. Combining structural data
with mutational studies, oligomerization measurements and
Markov state models of molecular dynamics simulations, we suggest
a mechanism by which oligomerization of dynamin is linked to
the release of intramolecular autoinhibitory interactions. We elucidate
how mutations that interfere with tetramer formation
and autoinhibition can lead to the congenital muscle disorders
Charcot-Marie-Tooth neuropathy⁴ and centronuclear myopathy⁵,
respectively. Notably, the bent shape of the tetramer explains how
dynamin assembles into a right-handed helical oligomer of defined
diameter, which has direct implications for its function in membrane
constriction.

The three highly conserved vertebrate isoforms of dynamin contain
five distinct domains (Extended Data Fig. 1a): an N-terminal GTPase
(G) domain mediating nucleotide binding and hydrolysis, a bundle
signalling element (BSE), a stalk, a pleckstrin homology (PH) domain
involved in lipid binding, and a proline-rich domain (PRD) mediating
interactions with scaffolding proteins containing BAR- and
SH3-domains⁶. To exert its function in clathrin-mediated endocytosis
(CME), dynamin assembles via the stalks into a helical array surrounding
the necks of invaginating clathrin-coated pits⁷,⁸. Dimerization of
GTP-bound G domains from neighbouring helical rungs induces GTP
hydrolysis⁹. The ensuing conformational changes are thought to be
transmitted from the G domain via the BSE to the stalk, resulting in a
sliding motion of adjacent helix rungs, concomitant helix constriction,
and eventually membrane scission. The inherent tendency to
form large assemblies at high protein concentrations has hampered
crystallization of dynamin in the past. Previously, the use of
non-oligomerizing mutants led to the determination of dynamin 1 crystal
structures²,³; however, the postulated higher-order assembly interface
was not resolved in these structures, leaving the oligomerization
mechanism unaddressed. We reasoned that an alternative
assembly-affecting mutation, such as K361S in dynamin 3 (ref. 11), might disturb
the oligomerization interface to a lesser extent than the previously used
mutants. We obtained crystals of nucleotide-free dynamin 3(K361S)
lacking the PRD (dynamin 3(K361S/ΔPRD)) that diffracted to 3.7 Å
(Methods, Extended Data Fig. 1, Extended Data Table 1). The asymmetric
unit of the crystal lattice contained a dynamin tetramer that did
not form the filamentous superstructures seen for dynamin 1 (refs 2, 3).

[Figure 1 image labels: outer molecule, inner molecule, G domain, interface 1,
interface 2, interface 3.]

Figure 1 | Structure of the dynamin 3 tetramer. The four molecules in the
tetramer are coloured separately; in the molecule on the right each domain is
individually coloured. The tetramer consists of two dimers, each formed via the
central interface 2. The two dimers are connected via interfaces 1 (left box) and
3 (right box) to build the tetramer. One inner molecule is omitted from the
detailed view for clarity.

¹Institut für Biophysikalische Chemie, Medizinische Hochschule Hannover, Carl-Neuberg-Str. 1, 30625 Hannover, Germany. ²Max-Delbrück-Centrum für Molekulare Medizin, Kristallographie, Robert-Rössle-Straße 10, 13125 Berlin, Germany. ³Institut für Mathematik, Freie Universität Berlin, Arnimallee 6, 14195 Berlin, Germany. ⁴Leibniz-Institut für Molekulare Pharmakologie, Robert-Rössle-Straße 10, 13125 Berlin, Germany. ⁵Forschungseinrichtung Strukturanalyse, Medizinische Hochschule Hannover, Carl-Neuberg-Str. 1, 30625 Hannover, Germany. ⁶Institut für Chemie und Biochemie, Freie Universität Berlin, Takustraße 6, 14195 Berlin, Germany. †Present address: Cancer Institute, University College London, 72 Huntley Street, London WC1E 6DD, UK.

*These authors contributed equally to this work.

404 | NATURE | VOL 525 | 17 SEPTEMBER 2015

The dynamin tetramer consists of two dimers, each of which
assembles via the previously described interface 2 (refs 2, 3, 12–14)
(Fig. 1, Extended Data Fig. 2a, b). Different dimerization and assembly
models were derived from electron microscopy (EM) reconstructions
and cross-linking experiments. These models, however, are
not compatible with the architecture of the dynamin tetramer
(Extended Data Fig. 2c). To provide further evidence for dimerization
via interface 2, we introduced the triple mutation I481D/H677D/
L678S in dynamin 3(ΔPRD) and the corresponding mutation
(I481D/H687D/L688S) in dynamin 1(ΔPRD). These mutants were
monomeric in analytical ultracentrifugation experiments (Extended
Data Fig. 2d). Thus, dimerization via interface 2 is indeed a general
feature of dynamin and dynamin-like proteins. This conclusion
receives additional support from recent cross-linking data¹⁵.

Dynamin 3 dimers further assemble into tetramers via interface 1
and interface 3 (Fig. 1 and Extended Data Fig. 3a). Interface 1 at the top
of the stalk features four hydrophobic residues that are highly conserved
in the dynamin superfamily (Fig. 1 and Supplementary Fig. 1).
The main contributors for interface 3 are loop L1N^S (superscript S
denotes belonging to the stalk) of the ‘inner’ and loop L2^S of the ‘outer’
stalks (Fig. 1), which mediate an intricate interaction network involving
all four stalks (Extended Data Fig. 3a). Accordingly, these loop
regions are well defined in the inter-dimer interface (Extended Data
Fig. 3b), but not at the outer, non-assembled sides of the tetramer.
Previous studies have shown that mutation of R399 in loop L2^S completely
destroys higher-order assembly and dynamin function.
In our structure, R399 of an outer molecule forms salt bridges to
E410 in the α2^S helix and to E345 in L1N^S in the outer and inner
molecules of the opposite dimer, respectively. In the hydrophobic core
of interface 3, L402 and F403 in L2^S of outer molecules interact with
F493, F496, L655 and T651 of outer molecules in the neighbouring
dimer (Fig. 1 and Extended Data Fig. 3a). Mutation of F403 and of
E410 yielded predominantly dimeric protein and compromised
liposome binding as well as liposome-stimulated GTPase activity
(Fig. 2a–c). In dynamin 2, the F403A mutation substantially interfered
with CME, as monitored by transferrin internalization (Fig. 2d). The
effect of E410A on CME was less pronounced (Fig. 2d), as the structural
defect may in part be compensated by the second salt bridge that
R399 forms to E345 (Fig. 1, Extended Data Fig. 3a).

L1N^S of an inner stalk also interacts with α1C^S of an outer stalk.
Accordingly, mutation of N-terminal (S347A/G348A/D349A), central
(Q350A/V351A/D352A/T353A) or C-terminal (L354A/E355A/
L356A/S357A) residue stretches in L1N^S interfered with tetramerization
(Fig. 2a). The central and C-terminal, but not the N-terminal
mutations compromised liposome binding and assembly-stimulated
GTPase activity (Fig. 2b, c). The Q350A/V351A/D352A/T353A
mutant showed a reduced ability to sustain CME of transferrin,
whereas the L354A/E355A/L356A/S357A mutant displayed a
dominant-negative effect in transferrin uptake assays (Fig. 2d). The
Charcot-Marie-Tooth neuropathy-related mutation G358R (refs 16,
17) is located in the C terminus of L1N^S (Extended Data Fig. 3c). This
mutation led to a dimeric mutant that did not bind to liposomes
(Fig. 2a–c). Likewise, it exhibited a dominant-negative effect on
CME (refs 16 and 17, and Fig. 2d). The bulky arginine side chain
probably interferes with the proper binding conformation of L1N^S.
Interestingly, the mutants L354A/E355A/L356A/S357A, G358R and
F403A were still recruited to clathrin-coated pits; these pits, however,
remained stable at the membrane surface (Extended Data Fig. 3d).
Thus, the function of dynamin at clathrin-coated pits, but not its
recruitment, depends on an intact interface 3.

[Figure 2 panels. a, Sedimentation coefficient distributions (S) for dynamin 3
mutants G358R, L354A/E355A/L356A/S357A, Q350A/V351A/D352A/T353A and
S347A/G348A/D349A. b, Liposome co-sedimentation gel. c, GTPase activity without
and with liposomes. d, Transferrin uptake in cells treated with scrambled siRNA
or dynamin 2 siRNA and expressing dynamin 2-eGFP variants.]

Figure 2 | Interface 3 is crucial for assembly and function of dynamin.
a, Sedimentation velocity experiments for dynamin 3 and the indicated
mutants. The following molecular masses were obtained for singly sedimenting
species: Q350A/V351A/D352A/T353A, 162 kDa; L354A/E355A/L356A/
S357A, 164 kDa; G358R, 167 kDa. The molecular weight of the dynamin 3
construct is 86 kDa. S, Svedberg; AU, absorbance units. b, Liposome
co-sedimentation assays for dynamin 3 and the indicated mutants. S, supernatant;
P, pellet fraction; WT, wild type. c, The observed rate of GTPase activity for
dynamin 3 and the indicated mutants in the absence or presence of liposomes.
The average of two independent measurements is shown, with deviations
ranging from 1% to 13%. d, The capacity of dynamin 2 mutants to reconstitute
defective CME in HeLa cells depleted of endogenous dynamin 2, as monitored
by fluorescent transferrin uptake. Data shown represent mean ± s.e.m., the
number of independent experiments is indicated in the bar. Sequence QIDT
(amino acids 350-353) in dynamin 2 corresponds to QVDT in dynamin 3.
siRNA, short interfering RNA. Raw data for b is available in the Supplementary
Information.
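
The assignment of the interface-3 mutants as dimers follows from simple arithmetic on the sedimentation masses quoted in the Figure 2 legend (162, 164 and 167 kDa against 86 kDa for the construct). The sketch below spells that out. Rounding the mass ratio to the nearest whole number of subunits is a simplification assumed here for illustration, and the function names are hypothetical.

```python
# Values quoted in the Figure 2 legend (kDa).
MONOMER_MASS_KDA = 86.0
OBSERVED_MASS_KDA = {
    "Q350A/V351A/D352A/T353A": 162.0,
    "L354A/E355A/L356A/S357A": 164.0,
    "G358R": 167.0,
}

OLIGOMER_NAMES = {1: "monomer", 2: "dimer", 3: "trimer", 4: "tetramer"}


def subunit_ratio(observed_kda, monomer_kda=MONOMER_MASS_KDA):
    """Observed mass expressed in multiples of the monomer mass."""
    return observed_kda / monomer_kda


def oligomeric_state(observed_kda, monomer_kda=MONOMER_MASS_KDA):
    """Nearest whole number of subunits, reported as a name where one is defined."""
    n = round(subunit_ratio(observed_kda, monomer_kda))
    return OLIGOMER_NAMES.get(n, f"{n}-mer")


if __name__ == "__main__":
    for mutant, mass in OBSERVED_MASS_KDA.items():
        print(f"{mutant}: {subunit_ratio(mass):.2f} x monomer, {oligomeric_state(mass)}")
```

All three ratios fall just below 2, which matches the description of these mutants as dimeric rather than tetrameric.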

The dimers in the dynamin tetramer are asymmetric concerning the
PH domain and the orientation of the G domain and the BSE (Fig. 1).
Compared to an outer molecule, G domains of inner molecules are
tilted by approximately 40° around hinge 1 between the BSE and the
stalk (Extended Data Fig. 4a). The PH domains of the outer molecules
bind to a conserved surface of the stalk (Fig. 3a, b), a similar site as in
dimeric dynamin 1 (ref. 2) (Extended Data Fig. 4b). The assignment of
the visible PH domains to the outer molecules is unambiguous
(Extended Data Fig. 4c). The PH domains of the inner molecules were
not resolved in the electron density (Extended Data Fig. 4d). Modelling
of the inner PH domains to positions equivalent to those observed for
the outer molecules leads to clashes with the outer stalks (Fig. 3c and
Extended Data Fig. 4e). Apparently, the PH domains have to be
released from their autoinhibitory site for oligomerization to proceed.
In keeping with this assumption, a dynamin 3 variant lacking the PH
domain assembled in the absence of membranes and the presence of
nucleotides more efficiently into regular oligomers than did wild-type
dynamin 3 (Extended Data Fig. 5a–c). Dynamin 3 tubulated liposomes
on its own and did not need a specific membrane curvature for binding.
At physiological salt concentrations, dynamin 3 efficiently bound
to and tubulated unfiltered Folch liposomes (Extended Data Fig. 5d, e).
This is in line with the presence of a tyrosine in position 596, which has
been suggested to serve as a determinant for curvature generation
versus curvature sensing¹⁸. When expressed in mammalian cells,
dynamin 2(ΔPH) formed large, presumably cytosolic aggregates that failed
to co-localize with clathrin and interfered with transferrin uptake in a
dominant-negative fashion (Extended Data Fig. 5f, g), as has been
shown previously for dynamin 1 (ref. 19). These results indicate that
the PH domain has important functions in oligomerization and membrane
binding.

[Figure 3 panels. b, Labelled substitutions include E368K/Q and R369W/Q.
d, Distance E368–R618 (Å) plotted against time (µs).]

Figure 3 | Coupling of autoinhibition and oligomerization. a, Stalks and PH
domains in the dynamin tetramer as seen in the crystal. The box defines the
view displayed in c. b, Close-up view of the PH-domain–stalk interface from
a. Mutations in dynamin 2 implicated in centronuclear myopathy are indicated
as pink balls; K361 and E355 as purple and black balls, respectively. c, Markov
state models were constructed from MD simulation data including the stalk
and PH domain. For each metastable PH domain conformation, three
representative structures are shown. Clashes between the PH domain and
assembling stalk (light-blue surface) are indicated by black ovals. Percentages
indicate the occurrence of each metastable state for wild-type dynamin 3
(black) and the mutant K361S (red). Open/closed: position of the PH domain
allows/inhibits oligomerization. d, Example trajectories. The conformational
dynamics are projected onto the Glu368–Arg618 distance, which illustrates
opening and closure of the PH domain at the autoinhibited site.

To investigate this dual function of the PH domain, we inserted the
mutations R364S, R518H, R518D and E355A into the interface
between the PH domain and stalk (Extended Data Fig. 6). Similar to
nearby interface-3 mutants, most of these mutations impeded assembly,
liposome binding, liposome-stimulated GTPase activity and
transferrin uptake. In contrast, the mutation R518D enhanced
oligomerization and GTP hydrolysis, as previously described for mutants in
the PH-domain–stalk interface.

Molecular dynamics (MD) simulations were carried out and analysed
by Markov models²¹⁻²³ to characterize the dynamics of the
PH-domain–stalk interface and its interplay with interface 3 (Fig. 3 and
Extended Data Fig. 7). The simulations showed that E355 and K361,
together with R518 and R364, are part of a network of polar interactions
(Extended Data Fig. 7a) that can rapidly interconvert leading
to three distinct binding modes (Fig. 3c). The preferred binding
interaction of the PH domain with the stalk was the autoinhibitory
‘closed’ conformation found in our crystals. In two other ‘open’
conformations, the PH domain was shifted along the stalk to a position
where it did not interfere with oligomerization, indicating a dynamic
equilibrium of oligomerization-permissive and non-permissive binding
modes. The mutation K361S resulted in the appearance of a
fourth, highly populated conformation that was also autoinhibitory
for oligomerization, whereas no oligomerization-permissive binding
modes were detected (Fig. 3c, d). Further MD simulations and
Markov models of the stalk with a dissociated PH domain indicated
that the mutant K361S predominantly stabilizes the loop L1N^S in a
conformation that is not adopted in the wild type (Extended Data
Fig. 7c). Together, these results may explain the reduced oligomerization
capacity of K361S, which is dimeric in solution (Extended Data
Fig. 6c). A set of highly conserved charged residues including K361
apparently regulates both the autoinhibitory interaction with the PH
domain and interactions with L1N^S, thereby tightly coupling
autoinhibition and oligomerization.
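
To make the idea of a Markov state model concrete, the following minimal sketch estimates a transition matrix from a discretized trajectory and derives the equilibrium population of each state, the kind of quantity reported as percentages for the metastable PH-domain conformations. It is a toy illustration only: the simple row-normalized count estimator, the state labels, the lag time and the example trajectory are all assumptions made here, and this section does not describe the estimator or software used for the published models.

```python
def count_matrix(dtraj, n_states, lag=1):
    """Count transitions i -> j separated by `lag` frames in a discrete trajectory."""
    counts = [[0] * n_states for _ in range(n_states)]
    for t in range(len(dtraj) - lag):
        counts[dtraj[t]][dtraj[t + lag]] += 1
    return counts


def transition_matrix(counts):
    """Row-normalize the counts so each row holds transition probabilities."""
    matrix = []
    for row in counts:
        total = sum(row)
        if total == 0:
            raise ValueError("a state has no outgoing transitions in the data")
        matrix.append([c / total for c in row])
    return matrix


def stationary_distribution(T, max_iter=10000, tol=1e-12):
    """Equilibrium state populations, obtained by repeatedly propagating pi <- pi T."""
    n = len(T)
    pi = [1.0 / n] * n
    for _ in range(max_iter):
        new = [sum(pi[i] * T[i][j] for i in range(n)) for j in range(n)]
        if max(abs(a - b) for a, b in zip(new, pi)) < tol:
            return new
        pi = new
    return pi


if __name__ == "__main__":
    # Invented two-state trajectory: 0 = 'closed' (autoinhibited), 1 = 'open'.
    dtraj = [0, 0, 1, 0, 0, 1, 0, 0, 1, 0]
    T = transition_matrix(count_matrix(dtraj, n_states=2, lag=1))
    print(T)
    print(stationary_distribution(T))
```

In practice the discrete states would come from clustering structural coordinates (for example an inter-residue distance such as the one plotted in Fig. 3d), and the lag time would be chosen by checking that the model's timescales have converged.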

Comparison of the dynamin tetramer with the filament-like arrangements
observed in the crystal structures of dynamin 1 (refs 2, 3) shows
that the tetramer is bent, such that the angle between the outer stalks is
changed by 20° (Figs 1 and 4a). We constructed a dynamin oligomer by
stepwise addition of tetramers to the free ends of the growing dynamin
assembly, using the geometry of interface 3 to connect the tetramers.
This led to a right-handed helix (Fig. 4b), closely matching the dimensions
of the dynamin 1 helix in the non-constricted, nucleotide-free
state. These observations indicate that formation of a right-handed
dynamin helix at the surface of a tubular membrane is an intrinsic
feature of stalk assembly via interface 3. The bent shape of the tetramer
appears to dictate the curvature of a membrane tubule around which
dynamin preferentially oligomerizes. Constriction of membrane
tubules to inner diameters smaller than 16 nm requires active GTP
turnover and the associated G domain interactions across helical turns.
Comparison of our structure with a recent cryo-EM model of a
super-constricted dynamin helix²⁶ (Extended Data Fig. 2c) suggests that
constriction of the dynamin helix is driven by conformational changes in
the stalk interfaces.

[Figure 4 image labels: BSE, G domain, stalks, lipid tubule, PH domains.]

Figure 4 | Assembly of the stalks leads to a right-handed dynamin helix.
a, Bent architecture of the dynamin 3 tetramer. Only the stalk helices are shown
as cylinders, first dimer in light blue, second dimer in dark blue. b, Assembly
of dynamin 3 tetramers using the geometry of interface 3 leads to a
right-handed helix which fits an EM map of the non-constricted dynamin 1 helix
(shown in mesh representation). For clarity, only the stalks are displayed. In the
inset, the G domain and BSE of an inner molecule and the PH domain of
the adjacent outer molecule are shown in surface representation.
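
The geometric point behind the helix model is that adding units one after another with the same relative placement is a repeated rigid-body (screw) transformation, and a repeated screw transformation always traces a helix. The sketch below shows this with a rotation about the z axis plus a fixed rise. The angle, rise and radius are arbitrary illustrative numbers, not measurements from the structure, and the function names are hypothetical; in the actual model the transform would be derived from superposing tetramers through interface 3.

```python
import math


def screw_step(point, angle_deg, rise):
    """Rotate a point about the z axis by angle_deg and shift it along z by rise."""
    x, y, z = point
    a = math.radians(angle_deg)
    return (x * math.cos(a) - y * math.sin(a),
            x * math.sin(a) + y * math.cos(a),
            z + rise)


def build_helix(start, angle_deg, rise, n_units):
    """Place n_units by applying the same screw step to each previous unit."""
    points = [start]
    for _ in range(n_units - 1):
        points.append(screw_step(points[-1], angle_deg, rise))
    return points


def handedness(points):
    """Classify the helix from its first three units (axis assumed along z)."""
    (x0, y0, z0), (x1, y1, _), (x2, y2, z2) = points[:3]
    # z-component of the cross product of successive steps gives the turning sense.
    turn = (x1 - x0) * (y2 - y1) - (y1 - y0) * (x2 - x1)
    advance = z2 - z0
    if turn * advance == 0:
        raise ValueError("no rotation or no rise: the points do not form a helix")
    return "right-handed" if turn * advance > 0 else "left-handed"


if __name__ == "__main__":
    # Arbitrary example: 30 degrees and 1 length unit per added subunit, radius 20.
    helix = build_helix((20.0, 0.0, 0.0), angle_deg=30.0, rise=1.0, n_units=13)
    print(handedness(helix))
    print(helix[-1])
```

The radius stays constant while the units climb along the axis, so the radius of the resulting tube is fixed by the transform itself, which mirrors the statement that the bent tetramer dictates the curvature of the helix.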

The stalks of our helix model fit well into a cryo-EM map of
nucleotide-free dynamin 1 assembled around a lipid tubule (Fig. 4b), but
the PH domains and the G domains protrude from the electron density
(Fig. 4b inset and Extended Data Fig. 8a). Apparently, the PH domains
are shed from the autoinhibitory stalk interface to bind the membrane
tubule, whereas the G domains move upwards from their positions. To
explain the assembly of dynamin, we propose an equilibrium between
PH domains bound (as seen for the outer molecules) and unbound (as
for the inner molecules) to their stalks. In the cytosol, this equilibrium
lies to the autoinhibited tetramer to prevent untimely oligomerization.
Centronuclear-myopathy-related mutations in the interface between
the stalk and PH domain (Fig. 3b and Extended Data Fig. 9) shift the
equilibrium towards the oligomerized state, thereby leading to disease.
Upon dynamin recruitment by accessory proteins to endocytic sites,
the equilibrium is driven towards the assembly-competent conformation.
This hypothesis is supported by studies showing that, in vivo,
dynamin helices are built by incorporation of dimer or tetramer units
rather than larger preformed dynamin assemblies²⁷,²⁸. Further interactions,
which may influence the assembly equilibrium, occur between
the BSE and stalk, or the G domain and PH domain of adjacent dimers
(Extended Data Fig. 8b–d). In this view, the effect of the disease-relevant
mutation R465W may be explained. In the tetramer, R465 of an
outer molecule is in close vicinity to the inner BSE of an adjacent dimer
and a tryptophan at this position is likely to modify this interaction
resulting in enhanced oligomerization.

A striking feature of dynamin assembly is the multitude of interactions
in all four molecules of the tetramer (Extended Data Fig. 3a). Our
results indicate that these contacts are not necessarily static, but are
characterized by a dynamic equilibrium of different binding conformations.
The formation of new interactions during assembly is compensated
for by the release of autoinhibitory contacts in the dynamin
tetramer. Such an assembly mode that involves many low-affinity
interaction sites facilitates reversibility and allows regulation, for
example through nucleotide binding, hydrolysis or phosphorylation²⁹.
It is the basis for the particular interaction mode of the semi-solid
dynamin polymer with its protein and membrane environment, which
has been previously identified in other CME proteins and has been
coined as ‘matricity’³⁰.


Online Content Methods, along with any additional Extended Data display items 
and Source Data, are available in the online version of the paper; references unique 
to these sections appear only in the online paper. 


Received 18 December 2014; accepted 3 August 2015. 
Published online 24 August 2015. 


1. Ferguson, S. M. & De Camilli, P. Dynamin, a membrane-remodelling GTPase. 
Nature Rev. Mol. Cell Biol. 13, 75-88 (2012). 

2. Faelber, K. et al. Crystal structure of nucleotide-free dynamin. Nature 477, 
556-560 (2011). 

3. Ford, M.G., Jenni, S. & Nunnari, J. The crystal structure of dynamin. Nature 477, 
561-566 (2011). 

4. Cowling, B. S., Toussaint, A., Muller, J. & Laporte, J. Defective membrane 
remodeling in neuromuscular diseases: insights from animal models. PLoS Genet. 
8, e1002595 (2012).

5. Durieux, A. C., Prudhon, B., Guicheney, P. & Bitoun, M. Dynamin 2 and human 
diseases. J. Mol. Med. 88, 339-350 (2010). 

6. Daumke, O., Roux, A. & Haucke, V. BAR domain scaffolds in dynamin-mediated 
membrane fission. Cell 156, 882-892 (2014). 

7. Hinshaw, J. E. & Schmid, S. L. Dynamin self-assembles into rings suggesting a 
mechanism for coated vesicle budding. Nature 374, 190-192 (1995). 

8. Takei, K., McPherson, P. S., Schmid, S. L. & De Camilli, P. Tubular membrane 
invaginations coated by dynamin rings are induced by GTP-gamma S in nerve 
terminals. Nature 374, 186-190 (1995). 

9. Chappie, J. S. et al. A pseudoatomic model of the dynamin polymer identifies a 
hydrolysis-dependent powerstroke. Cell 147, 209-222 (2011).

10. Faelber, K. et al. Structural insights into dynamin-mediated membrane fission. 
Structure 20, 1621-1628 (2012). 

11. Ramachandran, R. et al. The dynamin middle domain is critical for tetramerization
and higher-order self-assembly. EMBO J. 26, 559-566 (2007). 

12. Frohlich, C. et al. Structural insights into oligomerization and mitochondrial 
remodelling of dynamin 1-like protein. EMBO J. 32, 1280-1292 (2013). 

13. Gao, S. et al. Structure of myxovirus resistance protein a reveals intra- and 
intermolecular domain interactions required for the antiviral function. Immunity
35, 514-525 (2011). 

14. Gao, S. et al. Structural basis of oligomerization in the stalk region of dynamin-like 
MxA. Nature 465, 502-506 (2010). 

15. Srinivasan, S., Mattila, J. P. & Schmid, S. L. Intrapolypeptide Interactions between 
the GTPase Effector Domain (GED) and the GTPase Domain Form the Bundle 
Signaling Element in Dynamin Dimers. Biochemistry 53, 5724-5726 (2014). 

16. Koutsopoulos, O. S. et al. Mild functional differences of dynamin 2 mutations 
associated to centronuclear myopathy and Charcot-Marie Tooth peripheral 
neuropathy. PLoS ONE 6, e27498 (2011). 

17. Sidiropoulos, P. N. et al. Dynamin 2 mutations in Charcot-Marie-Tooth
neuropathy highlight the importance of clathrin-mediated endocytosis in 
myelination. Brain 135, 1395-1411 (2012). 

18. Liu, Y. W. et al. Differential curvature sensing and generating activities of dynamin 
isoforms provide opportunities for tissue-specific regulation. Proc. Natl Acad. Sci.
USA 108, E234-E242 (2011). 




19. Vallis, Y. et al. Importance of the pleckstrin homology domain of dynamin in 
clathrin-mediated endocytosis. Curr. Biol. 9, 257-263 (1999). 

20. Kenniston, J. A. & Lemmon, M. A. Dynamin GTPase regulation is altered by PH 
domain mutations found in centronuclear myopathy patients. EMBO J. 29, 
3054-3067 (2010). 

21. Bowman, G. R., Pande, V. S. & Noé, F. (eds) An Introduction to Markov State Models 
and Their Application to Long Timescale Molecular Simulation. (Springer, 2014). 

22. Schütte, C. & Sarich, M. Metastability and Markov models in molecular dynamics:
modeling, analysis, algorithmic approaches. In Courant Lecture Notes Vol. 24 
(American Mathematical Society, 2013). 

23. Kohlhoff, K. J. et al. Cloud-based simulations on Google Exacycle reveal ligand 
modulation of GPCR activation pathways. Nature Chem. 6, 15-21 (2014). 

24. Chen, Y. J., Zhang, P., Egelman, E. H. & Hinshaw, J. E. The stalk region of dynamin 
drives the constriction of dynamin tubes. Nature Struct. Mol. Biol. 11, 574-575 
(2004). 

25. Roux, A. et al. Membrane curvature controls dynamin polymerization. Proc. Natl 
Acad. Sci. USA 107, 4141-4146 (2010). 

26. Sundborger, A.C. et al. A dynamin mutant defines a super-constricted pre-fission 
state. Cell Rep. 8, 734-742 (2014). 

27. Cocucci, E., Gaudin, R. & Kirchhausen, T. Dynamin recruitment and membrane 
scission at the neck of a clathrin-coated pit. Mol. Biol. Cell 25, 3595-3609 (2014). 

28. Grassart, A. et al. Actin and dynamin2 dynamics and interplay during clathrin- 
mediated endocytosis. J. Cell Biol. 205, 721-735 (2014).

29. Graham, M. E., O’Callaghan, D. W., McMahon, H. T. & Burgoyne, R. D. Dynamin- 
dependent and dynamin-independent processes contribute to the regulation of 
single vesicle release kinetics and quantal size. Proc. Natl Acad. Sci. USA 99, 
7124-7129 (2002). 

30. Schmid, E. M. & McMahon, H. T. Integrating molecular and network biology to 
decode endocytosis. Nature 448, 883-888 (2007). 


Supplementary Information is available in the online version of the paper. 


408 | NATURE | VOL 525 | 17 SEPTEMBER 2015 


Acknowledgements This project was supported by grants from the Deutsche
Forschungsgemeinschaft (MA1081/8-2 to D.J.M.; SFB740/D7 and SFB958/A04 to
F.N.; SFB740/C8 and SFB958/A7 to V.H.; SFB740/C7 and SFB958/A12 to O.D.; and
ES410/2-1 to S.E.), an ERC consolidator grant (ERC-2013-CoG-616024 to O.D.), an
ERC starting grant (pcCell to F.N.) and a grant from the Einstein Foundation Berlin
(SOoPIC to N.P.). T.F.R. acknowledges partial financial support by the Cluster of
Excellence REBIRTH (DFG EXC 62/1). We thank B. Purfürst for help with electron
microscopy; S. Hertel, L. Litz, P. Straub and S. Wohlgemuth for experimental assistance;
and the staff at beamlines X06SA (PXI) and X06DA (PXIII) at the Swiss Light Source
(Villigen, Switzerland) for help during data collection. We thank Y.-W. Liu for
discussions, and A. Wittinghofer for his support and discussions in the initial stages of
the project.


Author Contributions T.F.R. grew the crystals and collected data; K.F. solved the
structure; T.F.R., K.F. and S.E. refined the structure; T.F.R. and R.A. purified protein for
crystallization and monomeric dynamin; K.F. and J.S. purified all other proteins and
performed liposome co-sedimentation, EM and GTPase assays; U.C. performed and
analysed analytical ultracentrifugation experiments; Y.P. and K.K. performed
transferrin uptake assays; N.P. and F.N. conducted and analysed molecular modelling
and molecular dynamics simulations. N.P. and Y.P. contributed equally to this work.
T.F.R., K.F., F.N., V.H., O.D. and S.E. interpreted structural data. T.F.R., K.F., Y.P., N.P., U.C.,
D.J.M., F.N., V.H., O.D. and S.E. designed the research. T.F.R., K.F., F.N., O.D. and S.E. wrote
the manuscript.


Author Information The atomic coordinates and structure factors of human dynamin 3
have been deposited in the Protein Data Bank (PDB) with accession number 5A3F.
Reprints and permissions information is available at www.nature.com/reprints. The
authors declare no competing financial interests. Readers are welcome to comment on
the online version of the paper. Correspondence and requests for materials should be
addressed to K.F. (katja.faelber@mdc-berlin.de), O.D. (oliver.daumke@mdc-berlin.de),
or S.E. (Eschenburg.Susanne@mh-hannover.de).




METHODS 


Protein expression and purification. Human dynamin 3 (splice form abb³¹,
residues 1-754) and indicated mutants of this construct were expressed from
the pProEx-HTb vector (Invitrogen) as N-terminal His₆-tag fusions followed by
a tobacco etch virus (TEV) cleavage site. The crystallized construct contained
the K361S mutation. Proteins were produced in Escherichia coli host strain
BL21(DE3), and expression was induced by addition of 0.1 mM
isopropyl-β-D-thiogalactopyranoside. Cells were grown overnight at 20 °C in terrific broth
medium. The following procedure was used for purification of dynamin 3(K361S) for
crystallization. Cells were resuspended in buffer A300 (50 mM HEPES-NaOH
(pH 7.5), 300 mM NaCl, 15 mM imidazole and 2 mM MgCl₂) including 1 mM
phenylmethylsulfonyl fluoride and 0.1% v/v NP-40, and disrupted by sonication.
Cleared lysates (30,000g, 1 h, 4 °C) were applied to a Ni²⁺-NTA column (Qiagen).
The column was sequentially washed with buffer A300 and with buffer
A100 (100 mM NaCl). Protein was eluted with buffer A100 containing an
additional 285 mM imidazole. Fractions containing human dynamin 3 were pooled
and diluted with an equal volume of 50 mM HEPES-NaOH (pH 7.5). The
diluted protein was loaded onto a HiLoad SuperQ anion exchange column (GE
Healthcare) equilibrated with buffer B50 (in which 50 refers to the NaCl
concentration) containing 50 mM HEPES-NaOH (pH 7.5), 50 mM NaCl, 2 mM DTE and
1 mM MgCl₂. After washing with buffer B50, bound proteins were eluted with a
linear gradient from 50 to 500 mM NaCl. Fractions containing human dynamin 3
were pooled, 1 mg TEV per 10 mg dynamin 3 was added, and the protein was incubated
on ice for 4 h. The solution was concentrated using 50 kDa molecular weight cut-off
concentrators (Amicon) and applied onto a Superdex 200 gel filtration column (GE
Healthcare) equilibrated with buffer B100. Fractions containing dynamin 3 were
pooled, concentrated and flash-frozen in liquid nitrogen.
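
As a rough check on the dilution step before anion exchange, the buffer composition after mixing can be worked out by volume-weighted averaging. The sketch below assumes that buffer A100 is buffer A300 with NaCl lowered to 100 mM (our reading of the text, not stated explicitly) and that volumes are additive; the function name `mix` is ours.

```python
def mix(sample, sample_vol, diluent, diluent_vol):
    """Return concentrations (mM) after mixing two solutions by volume."""
    total = sample_vol + diluent_vol
    species = set(sample) | set(diluent)
    return {
        name: (sample.get(name, 0.0) * sample_vol
               + diluent.get(name, 0.0) * diluent_vol) / total
        for name in species
    }

# Assumed eluate: buffer A300 composition with 100 mM NaCl ("buffer A100")
# plus the additional 285 mM imidazole used for elution.
eluate = {"HEPES": 50.0, "NaCl": 100.0, "imidazole": 15.0 + 285.0, "MgCl2": 2.0}
diluent = {"HEPES": 50.0}  # 50 mM HEPES-NaOH (pH 7.5)

# "Diluted with an equal volume": one part eluate, one part diluent.
loaded = mix(eluate, 1.0, diluent, 1.0)
print(loaded)
```

Under these assumptions the diluted sample contains 50 mM NaCl, which matches the NaCl concentration of buffer B50 used to equilibrate the anion exchange column.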

Wild-type and mutant dynamin 3 used for biochemical and biophysical
assays were expressed in E. coli Rosetta2-BL21-DE3 in autoinduction medium
(Novagen) and purified using a Co²⁺-Talon column, followed by overnight TEV
cleavage (4 °C, 30 µg per 1 mg fusion protein), dilution/concentration in
concentrators for imidazole removal and a second Co²⁺-Talon column run to capture
His₆-TEV and uncleaved His₆-dynamin. Finally, the peak fractions from a Superdex
200 gel filtration containing dynamin were pooled, concentrated to a maximum of
20 mg ml⁻¹ and flash-frozen in liquid nitrogen. The purification buffer contained
20 mM HEPES-NaOH (pH 7.8), 500 mM NaCl and 2 mM MgCl₂ (plus 100 mM
imidazole for elution, plus 2.8 mM β-mercaptoethanol during TEV cleavage). The
purified protein was nucleotide-free, as confirmed by high-performance liquid
chromatography analysis (see below for details).
Crystallization and structure determination. Crystallization trials by the
sitting-drop vapour-diffusion method were performed at 4 °C using a mosquito LCP
pipetting robot (TTP Labtech) and Rock Imager storage system (Formulatrix).
Human dynamin 3 in 150 nl volumes at a concentration of 20 mg ml⁻¹ was mixed
with an equal volume of reservoir solution from commercially available
preformulated screens (Qiagen). On a preparative scale, 2 µl of protein solution was
mixed with 2 µl of reservoir solution containing 100 mM MES-NaOH (pH 6.5)
and 15% 2-methyl-2,4-pentanediol. Crystals appeared after three to five days and
reached final dimensions of up to 0.5 mm × 0.3 mm × 0.3 mm. Crystals were
cryoprotected by immersion in reservoir solution supplemented with increasing amounts
of ethylene glycol, with the final solution containing 17% v/v ethylene glycol.
Cryoprotected crystals were flash-cooled in liquid nitrogen. Data were recorded
at beamline PXI-X06SA at the Swiss Light Source (Villigen, Switzerland). Native
data from a single crystal were processed and scaled using the program package
XDS³². The structure was solved by molecular replacement with Phaser³³ using the
structure of the nucleotide-free rat dynamin 1 G domain (2AKA), the stalk of
human dynamin (3SNH) and the human PH domain (1DYN) as search models.
The model was built using Coot³⁴ and iteratively refined using Phenix³⁵ with
noncrystallographic symmetry (NCS) between the outer and the inner molecules,
respectively, with reference model restraints against an artificial dynamin
construct composed of the high resolution search model domains, and with one
translation-libration-screw (TLS) group per domain.

Owing to weak electron density, all residues of the G domains of the inner 
molecules were chopped at the Cβ atoms, and the whole domains were refined 
as rigid bodies. In the final model, the outer molecules have disordered regions in 
the L1N^S loop, the L1” and L2°” loops and the L5? loop, and the inner molecules 
in the hinge 1 region and the L2° loop. Furthermore, the complete PH domains of 
the inner molecules are not resolved in the electron density. The structure was 
refined to Rwork/Rfree of 23.2%/27.8%. Of all residues, 94.5% are in the most
favoured regions of the Ramachandran plot and 0.6% (15 out of 2,500) of residues
are in the disallowed regions, as analysed with MolProbity³⁶. Figures were prepared
with PyMOL³⁷. Domain superpositions were performed with lsqkab³⁸. Sequences
were aligned using Clustal W³⁹ and adjusted by hand. The model of the
right-handed dynamin 3 helix was fitted manually into the EM map using
PyMOL³⁷ and Chimera⁴⁰.
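
For readers less familiar with the refinement statistics quoted above, the conventional definition of the crystallographic R-factor is sketched below. This is the textbook form and is not taken from the Methods; the symbols $F_{\mathrm{obs}}$ and $F_{\mathrm{calc}}$ (observed and model-derived structure factor amplitudes) are our notation.

```latex
R \;=\; \frac{\sum_{hkl} \bigl|\, |F_{\mathrm{obs}}(hkl)| - |F_{\mathrm{calc}}(hkl)| \,\bigr|}
             {\sum_{hkl} |F_{\mathrm{obs}}(hkl)|}
% R_work: sum over the reflections used in refinement.
% R_free: the same expression evaluated over a test set of reflections
%         that was excluded from refinement.
% Ramachandran outlier fraction quoted in the text: 15 / 2500 = 0.006 = 0.6 %.
```

A gap of a few percentage points between the two values, as reported here (23.2% versus 27.8%), is the usual pattern, because the test-set reflections are never fitted directly.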

Analytical ultracentrifugation. Sedimentation velocity experiments were carried
out in a ProteomeLab XL-I analytical ultracentrifuge (Beckman Coulter) at
35,000 r.p.m. and 20 °C using an An-50 Ti rotor. Concentration profiles were
measured using the manufacturer’s data acquisition software ProteomeLab XL-I
Version 6.0 (Firmware 5.7) with the absorption scanning optics at 280 nm.
Sedimentation velocity analysis was performed in a buffer containing
0.15 M NaCl, 50 mM HEPES-NaOH (pH 7.5) in 3 or 12 mm standard double
sector centrepieces filled with a 100 µl or 400 µl sample, respectively. For data
analysis, a model for diffusion-deconvoluted differential sedimentation coefficient
distributions (continuous c(s) distributions) implemented in the program
SEDFIT⁴¹ was used. For proteins sedimenting as a single species, molecular masses
were obtained from c(s) analysis as calculated from the s-value and diffusion
broadening of the sedimenting boundary. The dynamin 3(K361S/R399A/ΔPRD)
mutant, analysed in a concentration range from 4 to 23 µM, showed a single peak
in c(s) distributions with a sedimentation coefficient slightly decreasing with
increasing protein concentration (data not shown). Owing to hydrodynamic
non-ideality, this is expected for a protein that does not change its oligomerization
state with concentration⁴². Extrapolation to zero concentration yielded s20,w = 6.4 S,
and a molecular mass of 160 kDa was obtained from c(s) analysis. Since the
molecular mass of the monomer as calculated from the amino acid composition is
86 kDa, this mutant forms dimers in solution. For comparison, all other mutants
were analysed at a concentration of about 20 µM. The following molecular masses
were obtained from c(s) analyses of mutants that sedimented as a single species:
Q350A/V351A/D352A/T353A, 162 kDa; L354A/E355A/L356A/S357A, 164 kDa;
G358R, 167 kDa.
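
The assignment of oligomeric state from these masses amounts to dividing each measured mass by the monomer mass and rounding. A minimal sketch follows, assuming the 86 kDa monomer mass quoted for the ΔPRD construct applies to all listed mutants (the text does not state this explicitly); the helper name `oligomeric_state` and the label `dPRD` are ours.

```python
MONOMER_KDA = 86.0  # monomer mass from amino acid composition, as quoted in the text

# Molecular masses from c(s) analysis as reported in the text (kDa).
measured_kda = {
    "K361S/R399A/dPRD": 160.0,
    "Q350A/V351A/D352A/T353A": 162.0,
    "L354A/E355A/L356A/S357A": 164.0,
    "G358R": 167.0,
}

def oligomeric_state(mass_kda, monomer_kda=MONOMER_KDA):
    """Return (mass ratio, nearest whole number of subunits)."""
    ratio = mass_kda / monomer_kda
    return ratio, round(ratio)

states = {}
for name, mass in measured_kda.items():
    ratio, n_subunits = oligomeric_state(mass)
    states[name] = n_subunits
    print(f"{name}: {ratio:.2f} x monomer mass -> {n_subunits} subunits")
```

All four ratios fall between 1.8 and 2.0, consistent with the conclusion in the text that these single-species mutants are dimers.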

Partial specific volume, buffer density and viscosity were calculated from amino
acid and buffer composition, respectively, by the program SEDNTERP⁴³ and were
used to correct experimental s-values to s20,w. Figures were prepared using the
program GUSSI (http://biophysics.swmed.edu/MBR/software.html, provided by
C. Brautigam).
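
The correction to standard conditions mentioned above is conventionally written as follows. This is the standard textbook relation rather than a formula given in the Methods, and the symbols ($\bar{v}$ for partial specific volume, $\rho$ for density, $\eta$ for viscosity, subscripts $T,b$ for experimental temperature and buffer, $20,w$ for water at 20 °C) are our notation.

```latex
s_{20,w} \;=\; s_{T,b}\,
  \frac{\eta_{T,b}}{\eta_{20,w}}\,
  \frac{1 - \bar{v}\,\rho_{20,w}}{1 - \bar{v}\,\rho_{T,b}}

% Svedberg relation linking the sedimentation coefficient s and the
% diffusion coefficient D (from boundary broadening) to the molar mass M:
M \;=\; \frac{s\,R\,T}{D\,\bigl(1 - \bar{v}\,\rho\bigr)}
```

The second relation is why a molecular mass can be estimated from the s-value together with the diffusion broadening of the boundary, as described for the c(s) analysis.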

Liposome co-sedimentation assays. Liposomes were prepared as previously
described (http://www.endocytosis.org). Folch liposomes (total bovine brain lipids
fraction I from Sigma) in 20 mM HEPES-NaOH (pH 7.5), 100 mM NaCl were
extruded 13 times through a 0.1 µm filter. The resulting 0.2 mg ml⁻¹ liposomes
were incubated at room temperature with 4.0 µM of the indicated dynamin 3
construct for 10 min in 40 µl reaction volume, followed by a 213,000g spin for
10 min at 20 °C. The final reaction buffer contained 25 mM HEPES-NaOH (pH 7.5),
140 mM NaCl, 2 mM MgCl₂ and 1 mM KCl.

GTP hydrolysis assay. GTPase activities of 1 µM of the indicated dynamin
constructs were determined at 37 °C in 25 mM HEPES-NaOH (pH 7.5), 130 mM
NaCl, 2 mM MgCl₂ and 1 mM KCl, in the absence and presence of 0.1 mg ml⁻¹
0.1-µm-filtered Folch liposomes, using saturating concentrations of GTP as
substrate (1 mM for the basal and 3 mM for the stimulated reactions). Reactions were
initiated by the addition of protein to the reaction. At different time points,
reaction aliquots were 15-fold diluted and quickly transferred to liquid nitrogen.
Nucleotides in the samples were separated via a reversed-phase Hypersil ODS-2
C18 column (250 × 4 mm), with 100 mM potassium phosphate buffer (pH 6.5),
10 mM tetrabutylammonium bromide and 7.5% acetonitrile as running buffer.
Denatured proteins were adsorbed on a C18 guard column. Nucleotides were
detected by absorption at 254 nm and quantified by integration of the
corresponding peaks. Rates were derived from a linear fit to the initial reaction (<20% GTP
hydrolysed).
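
The rate determination described above (a linear fit restricted to the phase in which less than 20% of the GTP has been hydrolysed) can be sketched as follows. The time course is hypothetical and chosen only for illustration, and the function names are ours; nothing here reproduces the authors' analysis software.

```python
def linear_fit(xs, ys):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def initial_rate(times_min, product_um, gtp_total_um, enzyme_um, max_fraction=0.20):
    """Turnover (min^-1) from points where less than max_fraction of GTP is hydrolysed."""
    pts = [(t, p) for t, p in zip(times_min, product_um)
           if p / gtp_total_um < max_fraction]
    if len(pts) < 2:
        raise ValueError("need at least two time points in the initial phase")
    slope, _ = linear_fit([t for t, _ in pts], [p for _, p in pts])
    return slope / enzyme_um  # uM product per min per uM enzyme

# Hypothetical data (not from the paper): 1 uM enzyme, 1000 uM GTP.
# The last point (32% hydrolysed) lies outside the initial phase and is excluded.
times = [0, 60, 120, 240, 360, 900]          # min
gdp = [0.0, 30.0, 60.0, 120.0, 180.0, 320.0]  # uM product formed
rate = initial_rate(times, gdp, gtp_total_um=1000.0, enzyme_um=1.0)
print(rate)
```

Restricting the fit to the early points keeps substrate depletion and product accumulation from flattening the apparent slope.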

Electron microscopy. For electron microscopic studies (Zeiss EM910), 2 µM
dynamin 3 in 25 mM HEPES-NaOH (pH 7.5), 60 or 150 mM NaCl, 2 mM
MgCl₂, 1 mM KCl and 1 mM guanosine-5′-[(β,γ)-methyleno]triphosphate were
incubated at room temperature for 20 h without liposomes or for 20 min with
liposomes. The final concentration of unfiltered liposomes was 0.35 mg ml⁻¹.
Samples were spotted on carbon-coated copper grids (Plano GmbH) and
negatively stained with 3% uranyl acetate.

Transferrin uptake in HeLa cells. HeLa cells (ATCC no. CCL2, identified by their
short tandem repeat and isoenzyme profiles) were obtained from ATCC
(American Type Culture Collection) and tested for mycoplasma contamination
on a routine basis. Cells were cultured for 20-25 passages without further
authentication before starting a fresh culture from the immediately ATCC-derived stock.
HeLa cells were transfected with siRNA using oligofectamine (Invitrogen). The
sequence of the siRNA used to target human dynamin 2 was
5′-GCAACUGACCAACCACAUC-3′ (nucleotides 849-867). After 24 h, cells were transfected
with pEGFP-N1 (Clontech) or siRNA-resistant rat dynamin 2-pEGFP-N1 using
lipofectamine 2000 (Invitrogen). 72 h after siRNA transfection, cells were
serum-starved for 1 h and incubated with 15 µg ml⁻¹ transferrin-Alexa 647 (Molecular
Probes, Invitrogen) for 10 min at 37 °C. On ice, cells were washed once with cold
PBS and 10 mM MgCl₂ and once for 90 s with 0.1 M acetic acid (pH 5.3) and
200 mM NaCl to remove surface-bound transferrin. After two washes with cold
PBS and 10 mM MgCl₂, cells were detached from the culture dish by incubating for
5 min on ice with 0.1% Pronase E solution in PBS and 0.5 mM EDTA. Cells were
resuspended in 1% bovine serum albumin (BSA) in PBS, pelleted at 300g for 5 min
at 4 °C and then fixed in 4% paraformaldehyde, 4% sucrose in PBS for 15 min on
ice and a further 20 min at room temperature. Cells were pelleted and resuspended
in 1% BSA in PBS and analysed by flow cytometry using a BD FACSCalibur.
Transferrin fluorescence in GFP-positive cells was quantified and normalized to
cells rescued with wild-type dynamin 2-eGFP. No statistical methods were used to
predetermine sample size.
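
The normalization step can be illustrated with a short sketch: within each independent experiment, every condition is divided by the wild-type rescue value, and the normalized values are then summarized across experiments. The fluorescence numbers are hypothetical, and the choice of mean and s.e.m. as the summary follows the Extended Data legends rather than anything stated in this paragraph.

```python
from math import sqrt
from statistics import mean, stdev

def normalize_to_reference(experiments, reference="WT"):
    """Divide each condition by the reference value measured in the same experiment."""
    return [
        {cond: value / exp[reference] for cond, value in exp.items()}
        for exp in experiments
    ]

def mean_sem(values):
    """Mean and standard error of the mean (sample s.d. / sqrt(n))."""
    n = len(values)
    sem = stdev(values) / sqrt(n) if n > 1 else float("nan")
    return mean(values), sem

# Hypothetical transferrin fluorescence of GFP-positive cells (arbitrary units),
# one dictionary per independent experiment.
experiments = [
    {"WT": 200.0, "mutant": 100.0},
    {"WT": 400.0, "mutant": 240.0},
    {"WT": 100.0, "mutant": 40.0},
]

normalized = normalize_to_reference(experiments)
summary = {cond: mean_sem([e[cond] for e in normalized]) for cond in ("WT", "mutant")}
print(summary)
```

Normalizing within each experiment before averaging removes day-to-day differences in absolute fluorescence, so the wild-type rescue is 1.0 by construction.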
Localization of dynamin 2-eGFP mutants and analysis of clathrin-coated pit
dynamics. HeLa cells depleted of endogenous dynamin 2 as described above were
co-transfected with plasmids encoding eGFP or dynamin 2-eGFP and
mRFP-clathrin-light-chain. 72 h after siRNA transfection, cells were analysed by total
internal reflection fluorescence (TIRF) microscopy using a Nikon Eclipse Ti
(Andor sCMOS camera, Okolab incubator, Nikon PerfectFocus autofocus system,
60× TIRF-objective, operated by open source ImageJ-based Micromanager
software⁴⁴). For live imaging, cells growing on glass coverslips were kept in Hank’s
balanced salt solution with 5% fetal bovine serum. From 180 s dual-colour TIRF
recordings with a frame rate of 0.5 Hz, kymographs were created by selecting a line
of pixels from an individual cell and depicting this line over the duration of 90
frames.
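
The kymograph construction described above can be sketched in a few lines: 180 s at 0.5 Hz gives 90 frames, and the same line of pixels is taken from each frame and stacked along the time axis. The sketch simplifies the "line of pixels" to a single image row and uses a hypothetical synthetic recording; the function name is ours.

```python
def make_kymograph(stack, row, n_frames):
    """Stack one pixel row from each of the first n_frames frames (time runs downwards)."""
    if len(stack) < n_frames:
        raise ValueError("recording is shorter than the requested number of frames")
    return [list(frame[row]) for frame in stack[:n_frames]]

DURATION_S = 180
FRAME_RATE_HZ = 0.5
n_frames = int(DURATION_S * FRAME_RATE_HZ)  # 90 frames, one every 2 s

# Hypothetical 8 x 16 pixel recording in which every pixel stores its frame index,
# so that the time axis of the result is easy to inspect.
stack = [[[t for _ in range(16)] for _ in range(8)] for t in range(n_frames)]

kymo = make_kymograph(stack, row=3, n_frames=n_frames)
print(len(kymo), len(kymo[0]))
```

In the resulting image, a persistent clathrin-coated pit appears as a vertical streak and a short-lived one as a short dash, which is what makes attenuated dynamics visible at a glance.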
Molecular dynamics simulation and modelling. Molecular dynamics (MD)
simulations of the stalk and the pleckstrin homology (PH) domain (residues
322 to 710) were carried out for the wild type and the K361S mutant, each using
three different setups: (1) the crystal structure coordinates of chain C
superimposed to chain B were taken as starting point; (2) starting from setup 1, the PH
domain was moved 5 Å away from the stalk; and (3) the PH domain was absent.
Setups 1 and 2 were used in order to study the conformational equilibrium of
stalk-PH domain interactions. The loops joining the stalk to the PH domain
(residues 495 to 511 and 628 to 640) were generated for each setup and mutant
using VMD (visual molecular dynamics)⁴⁵ and were minimized and equilibrated
separately. The aim of setup 3 was to study the intrinsic conformational
dynamics of the L1N^S loop when the PH domain is dissociated from the stalk.
The coordinates of each setup and for each mutant were used to construct an
all-atom molecular model and run MD simulations in explicit solvent with
GROMACS⁴⁶ using the CHARMM27 force field⁴⁷. For the setup and equilibration
procedure, hydrogen atoms were added based on the heavy atom coordinates
followed by an initial energy minimization. The protein was then solvated in a
water box with a solvation layer of at least 10 Å, resulting in an overall system of
between 70,000 and 80,000 atoms (depending on the initial structure). Na⁺ and
Cl⁻ ions (100 mM) were added to buffer the system and obtain an overall neutral
simulation cell. The solvated and ionized system was again minimized and
equilibrated in the NVT (canonical) ensemble at 300 K with position constraints on the
protein heavy atoms. A second equilibration was carried out in the NPT
(isothermal-isobaric) ensemble, again with position constraints, followed by a 1 ns
equilibration without constraints. The equilibrated coordinates and velocities were
used as the starting point for twenty 100-ns production runs for each setup and
with both wild type and mutant K361S, giving rise to a total of 12 microseconds of
molecular dynamics data.
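
The quoted total of 12 microseconds follows directly from the simulation design; the arithmetic is spelled out below using only numbers stated in the paragraph (variable names are ours).

```python
n_setups = 3           # setups 1-3
n_variants = 2         # wild type and K361S
runs_per_system = 20   # production runs per setup and variant
run_length_ns = 100    # length of each production run

total_ns = n_setups * n_variants * runs_per_system * run_length_ns
total_us = total_ns / 1000
print(total_ns, "ns =", total_us, "microseconds")
```

Each of the six systems therefore contributes 2 microseconds, collected as many short trajectories rather than one long one, which is the sampling pattern Markov state models are designed to combine.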
Analysis with Markov state models. The conformations of the L1N^S loop and the
stalk-PH domain interactions can be well characterized by their hydrogen bonding
patterns within the loop or between stalk and PH domains. Here, 21 residue pairs,
shown in Extended Data Fig. 7d, were selected that can form hydrogen bonds or
salt bridges. The Cα distances of these residue pairs were evaluated in order to
obtain a low-dimensional representation of the respective configuration. These
distances were used to build Markov state models²¹,⁴⁸,⁴⁹ of setup 3 (L1N^S loop)
and using both setups 1 and 2 for the PH-domain-stalk interactions using the
EMMA program (http://pyemma.org)⁵⁰. The microstates of the Markov state
models were obtained by regular spatial clustering in the distance space. The
distance cutoff for the regular spatial clustering was chosen to obtain around 500
microstates. Using a cutoff of 10.75 Å for the wild type and a cutoff of 9 Å for the
mutant K361S resulted in 550 and 584 microstates, respectively. The
lag-time-dependent relaxation timescales, indicating approximate Markovianity⁵¹ at lag
times of 20 ns or larger, are shown in Extended Data Fig. 7. Reversible transition
matrices were then estimated at a lag time of 20 ns. The microstates of each
Markov state model were clustered into a set of three to four metastable states
using the robust Perron cluster analysis⁵². At this resolution, the metastable states
sampled by the different mutants can be clearly associated between wild type and
mutant, as shown in Fig. 3c. The Markov model was used to generate random
trajectories shown in Fig. 3d as described in ref. 23.
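
The main steps of this workflow (regular spatial clustering, estimation of a reversible transition matrix at a chosen lag, relaxation timescales, and generation of random trajectories) can be illustrated with a small standard-library sketch. It is not the EMMA implementation and does not use its API: symmetrizing the transition counts is a simple stand-in for a proper reversible estimator, the timescale formula shown is the two-state special case, the Perron cluster analysis step is omitted, and all data are synthetic.

```python
import math
import random

def regular_space_clustering(points, d_min):
    """Pick centres so that no two are closer than d_min, then assign each point."""
    centres = []
    for p in points:
        if all(math.dist(p, c) >= d_min for c in centres):
            centres.append(p)
    labels = [min(range(len(centres)), key=lambda i: math.dist(p, centres[i]))
              for p in points]
    return centres, labels

def reversible_transition_matrix(dtraj, n_states, lag):
    """Row-normalize symmetrized transition counts at the given lag (in frames)."""
    counts = [[0.0] * n_states for _ in range(n_states)]
    for a, b in zip(dtraj[:-lag], dtraj[lag:]):
        counts[a][b] += 1.0
    sym = [[counts[i][j] + counts[j][i] for j in range(n_states)]
           for i in range(n_states)]
    return [[x / sum(row) for x in row] for row in sym]

def two_state_timescale(T, lag_time):
    """Implied timescale -lag / ln|lambda_2|; for two states lambda_2 = T00 + T11 - 1."""
    lam2 = T[0][0] + T[1][1] - 1.0
    return -lag_time / math.log(abs(lam2))

def sample_trajectory(T, start, n_steps, rng):
    """Generate a random state sequence by repeatedly drawing from the rows of T."""
    traj = [start]
    for _ in range(n_steps):
        traj.append(rng.choices(range(len(T)), weights=T[traj[-1]])[0])
    return traj

# Clustering demo on hypothetical one-dimensional "distances".
points = [(0.0,), (0.1,), (1.0,), (1.1,), (2.5,)]
centres, labels = regular_space_clustering(points, d_min=0.5)

# Estimation demo: sample from a known two-state model, then re-estimate it.
rng = random.Random(0)
true_T = [[0.95, 0.05], [0.10, 0.90]]
dtraj = sample_trajectory(true_T, 0, 20000, rng)
T_est = reversible_transition_matrix(dtraj, n_states=2, lag=1)
t2 = two_state_timescale(T_est, lag_time=1.0)
print(labels, T_est, t2)
```

In the regular spatial clustering step a smaller cutoff yields more microstates, which is why the cutoffs reported above were tuned separately for wild type and mutant to reach roughly 500 states each.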


31. Cao, H., Garcia, F. & McNiven, M. A. Differential distribution of dynamin isoforms in 
mammalian cells. Mol. Biol. Cell 9, 2595-2609 (1998). 

32. Kabsch, W. XDS. Acta Crystallogr. D 66, 125-132 (2010). 

33. McCoy, A. J. et al. Phaser crystallographic software. J. Appl. Crystallogr. 40, 658-674
(2007).

34. Emsley, P., Lohkamp, B., Scott, W. G. & Cowtan, K. Features and development of 
Coot. Acta Crystallogr. D 66, 486-501 (2010). 

35. Adams, P. D. et al. The Phenix software for automated determination of 
macromolecular structures. Methods 55, 94-106 (2011). 

36. Chen, V. B. et al. MolProbity: all-atom structure validation for macromolecular 
crystallography. Acta Crystallogr. D 66, 12-21 (2010). 

37. The PyMOL Molecular Graphics System. Version 1.7.0.1. (Schrödinger, LLC). 

38. Kabsch, W. A solution for the best rotation to relate two sets of vectors. Acta 
Crystallogr. A 32, 922-923 (1976). 

39. Larkin, M. A. et al. Clustal W and Clustal X version 2.0. Bioinformatics 23, 
2947-2948 (2007). 

40. Pettersen, E. F. et al. UCSF Chimera – a visualization system for exploratory research
and analysis. J. Comput. Chem. 25, 1605-1612 (2004).

41. Schuck, P. Size-distribution analysis of macromolecules by sedimentation velocity
ultracentrifugation and Lamm equation modeling. Biophys. J. 78, 1606-1619
(2000).

42. Laue, T. M. & Stafford, W. F. III. Modern applications of analytical
ultracentrifugation. Annu. Rev. Biophys. Biomol. Struct. 28, 75-100 (1999).

43. Laue, M.T., Shah, B. D., Ridgeway, T. M. & Pelletier, S. L. in Analytical 
Ultracentrifugation in Biochemistry and Polymer Science (eds Harding, S. E. et al.) 
90-125 (Royal Society of Chemistry, 1992). 

44. Edelstein, A. et al. Computer control of microscopes using µManager. Curr. Protoc. 
Mol. Biol. http://dx.doi.org/10.1002/0471142727.mb1420592 (2010). 

45. Humphrey, W., Dalke, A. & Schulten, K. VMD: visual molecular dynamics. J. Mol. 
Graph. 14, 33-38 (1996). 

46. Lindahl, E., Hess, B. & van der Spoel, D. GROMACS 3.0: a package for molecular
simulation and trajectory analysis. J. Mol. Model. 7, 306-317 (2001).

47. Mackerell, A. D. Jr, Feig, M. & Brooks, C. L. III. Extending the treatment of
backbone energetics in protein force fields: limitations of gas-phase quantum
mechanics in reproducing protein conformational distributions in molecular
dynamics simulations. J. Comput. Chem. 25, 1400-1415 (2004).

48. Prinz, J.-H. et al. Markov models of molecular kinetics: Generation and validation.
J. Chem. Phys. 134, 174105 (2011).

49. Stanley, N., Esteban-Martin, S. & De Fabritiis, G. Kinetic modulation of a disordered
protein domain by phosphorylation. Nature Commun. 5, 5272 (2014).

50. Senne, M. et al. EMMA: A Software Package for Markov Model Building and 
Analysis. J. Chem. Theory Comput. 8, 2223-2238 (2012). 

51. Swope, W. C., Pitera, J. W. & Suits, F. Describing protein folding kinetics by 
molecular dynamics simulations. 1. Theory. J. Phys. Chem. B 108, 6571-6581 
(2004). 

52. Deuflhard, P. & Weber, M. Robust Perron cluster analysis in conformation 
dynamics. Linear Algebra Appl. 398, 161-184 (2005). 

53. Chappie, J. S. & Dyda, F. Building a fission machine – structural insights into 
dynamin assembly and activation. J. Cell Sci. 126, 2773-2784 (2013). 




[Figure, panels a-c. Recoverable labels: panel a, domain boundaries at residues 1, 33, 293, 321, 499, 512, 627, 643, 698, 738 and 859, with the sequence-derived domains GTPase domain, Middle domain, PH domain, GED and PRD; panel b, SDS-PAGE gel with lanes M, NI, SN, E, CL and P; panel c, stalk helix α4^S (aa 657-681).]

Extended Data Figure 1 | Characterization of the dynamin 3 construct.
a, Top: the domain structure of dynamin 3. The previously used sequence-derived
domain nomenclature is shown below. Bottom: a dynamin 3 monomer
colour-coded according to the domain architecture. b, SDS-PAGE
representing a typical purification of dynamin 3. M, marker proteins; NI,
whole-cell-lysate non-induced; SN, supernatant of cleared lysate; E, elution
peak of the Talon-Co²⁺ column; CL, after cleavage with TEV protease; P, pool
after gel filtration. c, Representative electron density map (stereo view). Two
stalk helices are shown as stick models, the 2Fo − Fc map is contoured at 1.0σ.
Raw data for b are available in Supplementary Information.




[Figure, panels a-d. Recoverable labels: Stalks; 'Short dimer' via interface 1; 'Long dimer' via interface 3; "previous dimer models are NOT in agreement with tetramer structure"; "proposed I2 / I3 rearrangement for superconstricted state"; curves for Dyn1-I481D/H687D/L688S, Dyn3-I481D/H677D/L678S, Dyn1-R361S/R399A and Dyn3-K361S/R399A; axes c(s) (AU/S) against s20,w (S).]


Extended Data Figure 2 | Dimerization of dynamin 3. a, Superposition of
dynamin 1 (grey; PDB code: 3SNH) and dynamin 3 (magenta and green)
dimers, colour-coded as in Fig. 1. The stalk arrangement in dynamin 3 is
essentially the same as in dynamin 1. b, Interface 2 in dynamin 3. The view is
rotated by 90° with respect to a. The zoom shows the side chains of residues
involved in interface formation. Residues whose mutation renders dynamin 1
and dynamin 3 monomeric are marked with an asterisk. c, Top: stalks of the
dynamin 3 tetramer, as seen in the crystal structure (left). Dynamin dimers
(dark and light blue) are formed via interface 2 (I2) and assemble into the
tetramer via interfaces 1 and 3 (I1 and I3, respectively). In alternative
dimerization models (middle and right), dynamin monomers assemble via
interface 1 (middle) or interface 3 (right) to form elongated dimers of different
shapes. Bottom: arrangement of stalks of dynamin 1 as fitted into a cryo-EM
density map of a super-constricted dynamin 1 helix (PDB code: 4UUD)²⁶.
d, Oligomeric state of dimer interface mutants, as assayed by analytical
ultracentrifugation at a protein concentration of 20 µM. The following
molecular masses were obtained from c(s) analyses: dynamin 1(R361S/R399A)
(176 kDa, dimeric) in dark blue; dynamin 1(I481D/H687D/L688S) (84 kDa,
monomeric) in light blue; dynamin 3(K361S/R399A) (165 kDa, dimeric) in red;
dynamin 3(I481D/H677D/L678S) (83 kDa, monomeric) in black.




[Figure, panels a-d. Recoverable labels: legend entries for hydrophobic interaction, polar interaction, and side chains in the outer and inner molecules of dimer 1 and dimer 2; kymograph rows eGFP, mRFP-clathrin and merge for each construct, including LELS354-357AAAA and dynamin2-eGFP.]


Extended Data Figure 3 | Dynamin assembly via interface 3. a, Schematic
overview of the interactions in interface 3. b, Details of loop L1N^S. The 2Fo − Fc
electron density is contoured at 1.0σ. c, The Charcot-Marie-Tooth-related
mutation G358R is located at the C-terminal end of loop L1N^S. It probably
disturbs the structural integrity of this loop and therefore might interfere with
oligomerization. d, Clathrin-coated pit dynamics in HeLa cells expressing
interface 3 mutants of dynamin 2. HeLa cells treated with dynamin 2 siRNA
were co-transfected with plasmids encoding eGFP or siRNA-resistant dynamin
2-eGFP and mRFP-clathrin-light-chain, and live cells were imaged at 37 °C by
TIRF microscopy. Shown are representative time-resolved line scans
(kymographs) from at least ten time-lapse recordings of individual cells.
Attenuated clathrin-coated pit dynamics upon depletion of endogenous
dynamin 2 are rescued only by re-expression of wild-type, not mutant,
dynamin 2-eGFP. Note that the dynamin 2 mutants tested displayed a more
diffuse subcellular distribution although they were still recruited to
clathrin-coated pits.




[Figure, panels a-e. Recoverable labels: ~40°; PH domain; Dynamin 1; Dynamin 3; Outer molecule; Inner molecule; "Electron density represents neighboring stalk (see section e) and symmetry related G domain"; PH domain in outer molecule; PH domain model for inner molecule.]


Extended Data Figure 4 | Localization of the PH domain in the tetramer.
a, Superposition of an outer dynamin molecule (magenta) and an inner
molecule (green) of the dynamin 3 tetramer. The comparison reveals an ~40°
rotation of the G domains and BSEs. Furthermore, the PH domain is visible
only in the outer molecule. b, Superposition of the stalk and PH domain in
dynamin 1 (grey) and dynamin 3 (magenta). c, Connectivity of PH domain
and stalk in the outer molecule. Shown are the stalk and PH domain of an outer
molecule (magenta) and the stalk of the corresponding inner molecule (green)
from a dimer. Since the gap of ~58 Å between V629 of the PH domain and
P643 of the inner stalk is too large to be spanned by the missing 13 residues
(grey dashed line), we can unambiguously assign the PH domains in dynamin
3 to the outer stalks (black dashed lines). All other potential connections
including molecules from the second dimer or symmetry-related tetramers
span even larger distances (not shown). In the crystal structures of dynamin 1,
an unequivocal assignment of the PH domain to a specific stalk was not
possible, due to the long unresolved linker regions between the stalk and the PH
domains. Concomitantly, the impact of the interface between stalk and PH
domain has not been generally recognized. d, The outer PH domains are
clearly defined in the electron density (left panel), whereas no density for a PH
domain is observed in the equivalent position at the inner stalks (right panel).
The density visible in the right panel corresponds mainly to a G domain
from a symmetry-related molecule. The 2Fo − Fc electron density is contoured
at 1.0σ. e, Modelling of a PH domain (grey) relative to an inner stalk (green)
in the same geometry as seen in the outer molecules leads to steric clashes
(black oval) with an adjacent stalk (blue).
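
The geometric argument in panel c can be checked with a back-of-the-envelope estimate: a fully extended polypeptide advances by at most about 3.8 Å per residue between consecutive Cα atoms. The sketch below assumes that value (a standard figure for trans peptides, not given in the legend) and assumes the ~58 Å gap is comparable to a Cα-Cα distance; the function name is ours.

```python
CA_CA_MAX_A = 3.8  # assumed distance between consecutive C-alpha atoms, in angstroms

def max_linker_span(n_missing_residues, ca_ca=CA_CA_MAX_A):
    """Upper bound on the distance bridged by a fully extended linker.

    n missing residues between two resolved anchor residues give n + 1
    consecutive C-alpha to C-alpha steps.
    """
    return (n_missing_residues + 1) * ca_ca

gap_a = 58.0     # V629 (PH domain) to P643 (inner stalk), from the legend
n_missing = 13   # residues 630-642

span = max_linker_span(n_missing)
can_bridge = span >= gap_a
print(f"maximum span {span:.1f} A, gap {gap_a:.0f} A, bridgeable: {can_bridge}")
```

Even in the unrealistic fully extended case the linker reaches only about 53 Å, which supports assigning each PH domain to the outer stalk.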




[Figure, panels a-g. Recoverable labels: scale bars, 40 nm; panel d, liposomes not extruded or extruded through filters of different pore size (µm), with S and P fractions; panel e, Liposomes, Liposomes + GMPPCP, Liposomes + GMPPCP + ΔPH; panel f, eGFP and mRFP-clathrin; panel g, ΔPH, + siRNA DNM2, scrambled siRNA.]


Extended Data Figure 5 | The PH domains regulate oligomerization of
dynamin. a, In the absence of liposomes, a dynamin 3 variant lacking the PH
domain (ΔPH) was sedimented more efficiently than wild-type dynamin 3
(WT). Both ΔPH and wild type lacked the PRD. The proteins were sedimented
by ultracentrifugation after 20 h of incubation at low salt concentrations
(60 mM NaCl) in the presence of the non-hydrolysable GTP analogue
GMPPCP. S, supernatant; P, pellet fraction. b, c, Representative negative-stain
electron micrographs of wild type (b) and ΔPH (c) under the same conditions
as in a. For each protein, at least eight micrographs were recorded. Both
constructs showed oligomeric ring structures, similar to structures seen for
full-length dynamin. Our data indicate that oligomerization of dynamin
does not require membrane binding, but membrane binding requires
oligomerization (Fig. 2). d, In liposome co-sedimentation assays, dynamin 3
bound to Folch liposomes independently of their size. Not extr., not extruded.
e, At physiological salt concentrations (150 mM NaCl), dynamin 3 efficiently
tubulated unfiltered Folch liposomes. In contrast, ΔPH did not decorate the
liposome surface and did not induce liposome tubulation. For each setup at
least 12 micrographs were recorded. f, When expressed in HeLa cells, dynamin
2(ΔPH) formed large cytosolic aggregates that did not co-localize with
mRFP-clathrin. Arrowheads indicate co-localization for wild-type dynamin 2.
Shown are magnified insets of representative images from at least 20 individual
cells, acquired by TIRF microscopy. g, Dynamin 2(ΔPH) was
dominant-negative in transferrin uptake assays. Data shown represent mean ± s.e.m.; the
number of independent experiments is indicated in the bar. Raw data for a and
d are available in Supplementary Information.




[Figure, panels a-f. Recoverable labels: panel a, WT and R518D after 10 min at 22 °C or 37 °C, absorption at 280 nm (AU) against elution volume (ml); panel b, µM GTP against t (s), without liposomes; further panels, S and P fractions, − liposomes and + liposomes, + siRNA DNM2.]


Extended Data Figure 6 | Mutational analysis of the interface between PH
domain and stalk. a, Analytical gel filtration analysis for wild-type dynamin 3
and the mutant R518D. The proteins were pre-incubated for 10 min at 22 °C or
37 °C. When pre-incubated at 37 °C, only R518D showed a higher molecular
weight species. AU, arbitrary units. b, Intrinsic GTPase activity of wild-type
dynamin 3 and the mutant R518D at 37 °C in the absence of liposomes. The
lines represent linear fits of GTP hydrolysis versus time. For R518D, a biphasic
behaviour of the GTPase activity was apparent (for wild type: k_obs = 0.5 min⁻¹,
for R518D: k_obs1 = 2.2 min⁻¹ and k_obs2 = 13.3 min⁻¹). This biochemical
behaviour is reminiscent of dynamin 1 mutants in the PH-domain-stalk
interface that show increased oligomerization and GTPase rates when
incubated at 37 °C”. Perturbations in this interface appear to promote
oligomerization of dynamin, pointing to an autoinhibitory function of this
interface for oligomerization. The average of two independent measurements is
shown with deviations ranging from 0% to 0.05% for wild type and 0% to 0.62%
for R518D. c, Analytical ultracentrifugation experiments for the indicated
dynamin 3 variants, as in Fig. 2a. For the mutant K361S that sediments as a
single species, a molecular mass of 164 kDa could be obtained from c(s) analysis,
indicating that this mutant forms dimers in solution. d, Liposome
co-sedimentation analysis for the indicated mutants. S, supernatant; P, pellet
fraction. e, GTPase activity of the indicated mutants in the absence and
presence of liposomes. Shown is the average of two independent measurements,
with deviations ranging from 1% to 11%. f, Ability of dynamin 2 mutants to
rescue defective CME of transferrin in the absence of endogenous dynamin 2. The
assay was performed as described in Fig. 2d. R518 in dynamin 3 corresponds to
R522 in dynamin 2 and the R522H mutation in dynamin 2 is implicated in
centronuclear myopathy. Data shown represent mean ± s.e.m., the number of
independent experiments is indicated in the bar. Note, we generally observed
that the GTPase experiments were the most sensitive indicators of structural
perturbations induced by mutations. Compared to membrane binding assays,
GTPase assays appear to be more sensitive to the actual architecture of the
dynamin oligomer and alterations induced by point mutations. Transferrin
uptake assays could be influenced by cellular factors, such as BAR-domain
proteins that may stabilize mutant dynamin forms with deficits in
oligomerization. Raw data for d is available in Supplementary Information.


[Extended Data Figure 7 image. Recoverable labels: residues Lys376, Lys445, Arg343, Lys361, Glu368 and Asp349; plots of timescale [ns] (log scale) versus lag time [ns], titled 'K361S, stalk + PH' and 'Wt, L1N^S-loop'.]
d

Loop L1N^S interactions        Stalk - PH domain interactions
L1N^S      Stalk               Stalk      PH domain
Glu355     Lys361              Glu355     Arg518
Glu355     Arg364              Lys361     Glu607
Glu355     Arg369              Arg364     Glu607
Glu355     Arg343              Glu368     Arg618
Glu355     Lys441              Arg369     Glu607
Glu355     Lys445              Lys376     Asp610
Asp352     Lys361              Lys376     Glu607
Asp352     Arg364
Asp352     Arg343
Asp352     Lys441
Asp352     Lys445
Asp349     Lys342
Asp349     Lys441
Asp349     Lys445

Extended Data Figure 7 | Molecular dynamics simulations and Markov
models. a, The PH-domain-stalk interaction is characterized by a number
of mainly polar interactions. The represented conformation is one of the
starting structures (setup 2) for the MD simulations and quickly converts
into one of the metastable conformations shown in Fig. 3. b, Relaxation
timescales of different constructs as a function of lag time computed from
Markov models. The timescales of all models (black) have converged at a lag
time of about 20 ns within statistical uncertainty (colour-shaded regions),
indicating approximate Markovianity. The grey area indicates the region with
lag times larger than relaxation timescales. c, Top: intrinsic conformation
dynamics of the L1N^S loop shown for the wild type (black) and the mutant
K361S (red). Bottom: six metastable conformations and their equilibrium
probabilities of the L1N^S loop (setup 3) for the wild type (black) and mutant
K361S (red) computed from the Markov model. d, Residue pairs used to
characterize the L1N^S loop and stalk-PH domain interactions.
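
The convergence test mentioned in panel b can be illustrated with a small sketch. It assumes the standard relation for implied timescales, t = -τ / ln|λ(τ)|, where τ is the lag time and λ the second eigenvalue of the transition matrix; the caption does not spell this formula out. The two-state matrix and all function names below are made up for illustration and are not taken from the paper.

```python
import math

def second_eigenvalue(T):
    """Second eigenvalue of a 2x2 row-stochastic matrix (the first is always 1)."""
    return T[0][0] + T[1][1] - 1.0

def implied_timescale(T, lag_ns):
    """Implied relaxation timescale t = -lag / ln|lambda_2|, in ns."""
    return -lag_ns / math.log(abs(second_eigenvalue(T)))

def matmul2(A, B):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

# Toy transition matrix at a lag time of 20 ns (made-up numbers).
T_20 = [[0.9, 0.1],
        [0.2, 0.8]]

# For a perfectly Markovian process, the matrix at 40 ns is the square of T_20.
T_40 = matmul2(T_20, T_20)

t_20 = implied_timescale(T_20, 20.0)
t_40 = implied_timescale(T_40, 40.0)
print(round(t_20, 2), round(t_40, 2))
```

In this idealized case the two timescales agree exactly, which is the behaviour that a plateau in the timescale-versus-lag-time plot is meant to approximate for real simulation data.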


[Extended Data Figure 8 image. Recoverable labels: G domains and BSEs; Stalks; PH domains; PH domain; Inner G domain.]



Extended Data Figure 8 | Interactions of the G domain, stalk and BSE in
the tetramer. a, Two views on a fitting of the dynamin 3 tetramer crystal
structure into the EM density of non-constricted oligomerized dynamin 1
(ref. 24). The positions of the inner G domains are shown in all four molecules
since the outer G domains in our crystals are stabilized by crystal contacts.
Apparently, membrane binding and oligomerization are associated with
major movements of the G domain, BSE and the PH domain (indicated by
arrows). b, A loop of the outer PH domain and an inner G domain are in
close proximity. c, The outer G domains (left), but not the inner G domains
(right), are well defined in the electron density. The 2Fo − Fc electron density is
contoured at 1.0σ. The weak electron density for the inner G domains and
the resulting uncertainty in determining the contact sites prevented us from
analysing this interaction in more detail. d, The BSE of an inner monomer
(grey) interacts with the stalk of an outer monomer (magenta). This contact
involves R465, which is mutated to tryptophan in some centronuclear
myopathy patients. The R465W mutation leads to hyperactive dynamin‘ that
fragments the T tubule network in mouse-myoblast-derived myotubes and
Drosophila body wall muscle (Y.-W. Liu, personal communication).


[Extended Data Figure 9 image. Recoverable labels: G domain; legend: Centronuclear Myopathy, Charcot-Marie-Tooth Neuropathy.]


Extended Data Figure 9 | Disease-relevant mutations in dynamin. Localizations of mutations leading to Charcot-Marie-Tooth neuropathy (black balls) and 
centronuclear myopathy (pink balls) are plotted onto a dynamin 3 monomer. Colour code as in Fig. 1. 




Extended Data Table 1 | Data collection and refinement statistics

Data collection
  Space group                       P2₁2₁2₁, 1 tetramer / ASU
  Cell dimensions
    a, b, c (Å)                     97.70, 98.00, 401.52
  Resolution (Å)*                   3.70 (3.70-3.80)
  Rsym (%)*                         7.0 (130)
  <I/σ(I)>*                         16.6 (1.8)
  Completeness (%)*                 99.6 (99.3)
  Redundancy                        7.3

Refinement
  Resolution (Å)                    49.47–3.7
  No. reflections                   42,058
  Rwork / Rfree (%)                 23.2 / 27.8
  No. of protein atoms              18,654
  Averaged B-factor protein (Å²)    212
  R.m.s. deviations
    bond lengths (Å)                0.004
    bond angles (°)                 0.889

* Data in highest resolution shell are indicated in parenthesis.




LILLIE PAQUETTE, MIT SCHOOL OF ENGINEERING 


THE CELL MENAGERIE: 
HUMAN IMMUNE PROFILING 


Cutting-edge tools and analyses are digging deeper than ever before 
to unveil the intricacies of the diverse human immune system. 


Advanced technologies enable researchers at the Massachusetts Institute of Technology to observe individual immune cells attacking tumour cells. 


BY MARISSA FESSENDEN 


Vaccines save lives, but they don't always work. Take the annual
influenza shot: by some estimates, flu
vaccines are only 50-70% effective even when 
well matched to the virus strains in broad cir- 
culation. Despite all the research, scientists 
still cannot predict whether a given vaccine 
will work for any specific person. 

Learning to make vaccines that protect more 
people means getting a better handle on the 
immune system — a bewildering militia of 
cells that communicate to detect and destroy 
pathogens. So far, attempts to parse the sys- 
tem’s complexity have involved work on mice, 
rats, rabbits, dogs, non-human primates and 
even lampreys and sea urchins. Yet results do 
not always translate to the one species that
medicine cares most about. “There has been a
vast zoo of animal models, but the one animal 
model we haven't yet exploited is us — Homo 
sapiens,” says Bali Pulendran, an immunologist
at Emory University in Atlanta, Georgia. 
Now, researchers are tackling the most 
difficult animal to study as never before. 
Advances in technology are helping scientists 
to dive deeper into the inner workings of single 
cells and carry out analysis on greater numbers 
of cells at once. Efforts in data analysis, shar- 
ing and collaboration promise to enable work 
that is too expensive for individual labs. Ulti- 
mately, researchers hope to bring fresh insights 
to the clinic to protect and treat people using the 
power of an individual's own immune defences. 
The human immune system is incredibly 
diverse. Each class of immune cell is actually 
an army of subtypes. The elite forces are the lymphocytes,
which recognize specific pathogens or wayward body cells.
They consist of natural killer (NK) cells, which quickly
dispatch infected or cancerous cells, and B and T
cells, which bear receptors on their surfaces 
designed to recognize specific invaders. But B 
and T cells break down further: there are regula- 
tory T cells, T helper cells, memory B cells, naive 
B cells and more, each with its own unique role.
These lymphocytes coordinate in turn with cells 
such as macrophages and monocytes, which are 
further specialized for other functions. 

Diversity manifests between people, too. 
Even identical twins vary in terms of the 
exact molecules and cell profiles that fight off 
disease. From an evolutionary point of view, 
variability ensures that some members of a 
species will survive a deadly disease outbreak 
— but it confounds researchers. 


17 SEPTEMBER 2015 | VOL 525 | NATURE | 409 




PROFILE OF AN IMMUNE CELL

Complex droplets consisting of water, oil, cells and magnetic beads reveal
how gene sequences are paired in individual immune cells.

A special nozzle encases individual cells in a core of water surrounded by
oil and carrying tiny magnetic beads.

Cells encased in the droplets are lysed, and genetic material adheres
to the beads.

... and genetic material is prepared for sequencing.

Labels: Rapidly moving oil phase; Magnetic bead.

Gender, ethnicity, genetic background
and disease history all affect a person's immune 
response in unpredictable ways. They influence 
whether a vaccine will work, and whether some- 
one has allergies or an autoimmune disease 
— both resulting from an overactive immune 
system — or whether a person will develop 
cancer, which is caused in part by an inattentive 
system that fails to remove errant cells. 


VACCINES UNVEILED 

Instead of seeing confusion in such diversity, 
researchers such as Pulendran see opportunity. 
With the right combination of sophisticated 
technologies and data analysis, human vari- 
ation can offer a natural experiment in what 
underlies an effective immune response. 

This reasoning led Pulendran and his team 
to some groundbreaking research on why a 
vaccine for yellow fever works so well. Since 
immunization against the sometimes-deadly 
tropical disease began in 1937, only 12 cases 
have been reported among the hundreds of 
millions of people immunized. 

Scientists have long known that the vaccine 
spurs the body to produce T cells that can kill 
cells infected with the yellow-fever virus — but 
they did not know how. In 2009, Pulendran’s 
team published an analysis¹ of changes in the
state, number and types of immune cells in 
the body before and after vaccination. The 
group found that quantities of a protein called 
EIF2AK4 spike in key immune cells (mainly 
dendritic cells, which help T cells to identify 
invaders) just days after vaccination. The higher 
the spike in protein levels, the more anti-yellow- 
fever T cells are later produced. 

The close correlation suggests the existence 
of components that foster strong immune
responses, at least for the yellow-fever vaccine.
Pulendran and his colleagues² have since
discovered other proteins that predict similarly 
strong responses to vaccines for flu and menin- 
gococcal disease. Now, they are linking these 
types of marker to subpopulations of cells and 
classifying variation across individuals. 

[Graphic labels: mRNA strand; Lysed cell; Beads are collected ...]

One major reason that immune responses
vary is the vast collections of receptors on the
surfaces of T and B cells, which correspond to
antibodies that are secreted by the latter cells.
To produce a near-endless assortment of these 
Y-shaped molecules, lymphocytes shuffle their 
genes as they mature. The myriad receptors and 
antibodies that result enable the immune system 
to recognize many different pathogens. 

Researchers want to sequence genes for these 
receptors to work out what makes a potent 
immune response, and so gain clues for devel- 
oping vaccines and for designing therapies that 
could spur the immune system to fight cancer. 

But because each receptor is made from pro- 
teins encoded by at least two types of separately 
shuffled gene segment, sequences alone are not 
enough. Researchers must also learn how these 
proteins are paired in an individual cell — and 
which combinations show the most promise 
for fighting disease. 

At the University of Texas at Austin, chemical 
engineer George Georgiou has tackled this chal- 
lenge by studying B cells one at a time. He and 
his team³ first encase individual cells in
complex droplets: a core of water that preserves the
cell’s genetic material; an outer oil layer to
keep the cells separated; and magnetic
beads that allow
researchers to manipulate each droplet and so
capture and extract the genetic material from
individual cells. These data can reveal the anti- 
body repertoire elicited by various stimuli — 
crucial information for designing vaccines (see 
‘Profile of an immune cell’). 

Georgiou’s group hopes to publish manual- 
like methods so that others can use the 
technique. Investigators who are unfamiliar 
with it should prepare themselves for a steep 
learning curve, he says: “Every method, espe- 
cially new methods from academic labs, has 
some nuances.” But for those who are willing 
to put in the time and elbow grease, precise 
answers about individual immune cells await. 

[Graphic label: Isolated genetic material is sequenced, revealing the unique immune receptors present in each cell.]

Another sequencing approach relies on
specially formulated beads to tag individual
cells before DNA analysis. The tags fuse with 
the cells’ genetic material and function as bar- 
codes that can be traced back to the original cell, 
even when cells are analysed in pools. Hedda 
Wardemann, an immunologist at the Max 
Planck Institute for Infection Biology in Berlin, 
has used this strategy to analyse genes encoding 
paired receptor proteins in more than 46,000 B 
cells at a time⁴.
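
The barcoding idea described above can be sketched in a few lines. This is only an illustration of the bookkeeping, under the assumption that each pooled read carries a cell barcode and a chain label; the barcodes, chain names and sequences are invented and do not come from the cited study.

```python
from collections import defaultdict

# Hypothetical pooled reads: (cell barcode, chain type, receptor sequence).
# All values are made up for illustration.
reads = [
    ("AAC", "heavy", "CARDYW"),
    ("GTT", "light", "CQQSYT"),
    ("AAC", "light", "CQQYNS"),
    ("GTT", "heavy", "CARGGF"),
    ("CCA", "heavy", "CAKDLW"),  # no light-chain read recovered for this cell
]

def pair_chains(reads):
    """Group pooled reads by barcode and keep cells with both chains."""
    by_cell = defaultdict(dict)
    for barcode, chain, sequence in reads:
        by_cell[barcode][chain] = sequence
    return {
        barcode: (chains["heavy"], chains["light"])
        for barcode, chains in by_cell.items()
        if "heavy" in chains and "light" in chains
    }

pairs = pair_chains(reads)
print(pairs)
```

Because every read keeps its barcode, the pairing survives pooling: reads can be shuffled in any order and still be traced back to the cell they came from.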

Most microfluidics sequencing platforms are 
developed by individual labs, such as Georgiou’s 
or Wardemann’s, that have engineering know-how.
But as the field grows, companies are
getting into the game. One of the biggest players 
in the microfluidics field is Fluidigm in South 
San Francisco, California. 

This September, the company began to ship 
high-throughput chips, which can be used on 
the company’s C1 microfluidics platform to 
interrogate genomes of 800 individual cells in 
a single 6.5-hour run. Although it has a much 
lower throughput than Georgiou’s droplet 
method (which can process 6 million B cells in a
day), Fluidigm’s technology requires less expertise.
The company plans to increase throughput
to nearly 100,000 cells per run in the near future. 
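
To put the two throughput figures on a common scale, the quoted numbers can be converted to cells per hour. This is a rough sketch that assumes "a day" means 24 hours of processing, which the text does not state.

```python
# Figures quoted in the text.
c1_cells_per_run = 800
c1_run_hours = 6.5
droplet_cells_per_day = 6_000_000

# Assumption (not stated in the text): "a day" means 24 hours of processing.
hours_per_day = 24

c1_rate = c1_cells_per_run / c1_run_hours
droplet_rate = droplet_cells_per_day / hours_per_day
ratio = droplet_rate / c1_rate

print(f"C1 chip: about {c1_rate:.0f} cells per hour")
print(f"Droplet method: about {droplet_rate:.0f} cells per hour")
print(f"Ratio: about {ratio:.0f}-fold")
```

Under that assumption the droplet method processes cells roughly 2,000 times faster per hour than a single C1 run.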


PERTURBED POPULATIONS 
In addition to profiling individual cells (see 
‘Nanoarenas for cell attacks’), researchers 
want to track how cell populations change in 
response to vaccination or infection. To iden- 
tify specific cell types, the scientists rely on 
protein markers studded on the cells’ surfaces. 
For example, two markers dubbed CD4 and 
CD8 both show up on certain types of mem- 
ory T cells — but CD8 is also on NK cells, and 
CD4 is on monocytes and dendritic cells. So, 
to measure only memory T cells, researchers 
may need to screen for three different mark- 
ers. To isolate an even more-specific subset, the 
number of markers must increase. 
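
The marker logic in this paragraph amounts to a boolean filter, sketched here on toy data. The cell names are illustrative, and "MARKER_X" is a hypothetical stand-in for the third marker, which the text does not name.

```python
# Toy cells, each described by the set of surface markers detected on it.
# "MARKER_X" is a hypothetical stand-in for a third, T-cell-specific marker.
cells = {
    "memory_CD4_T": {"CD4", "MARKER_X"},
    "memory_CD8_T": {"CD8", "MARKER_X"},
    "NK_cell": {"CD8"},
    "monocyte": {"CD4"},
    "dendritic_cell": {"CD4"},
}

def gate(cells, any_of, all_of):
    """Names of cells with at least one marker in any_of and every marker in all_of."""
    return sorted(
        name for name, markers in cells.items()
        if markers & any_of and all_of <= markers
    )

# CD4 or CD8 alone also picks up NK cells, monocytes and dendritic cells.
two_marker_gate = gate(cells, any_of={"CD4", "CD8"}, all_of=set())

# Requiring the third marker as well leaves only the memory T cells.
three_marker_gate = gate(cells, any_of={"CD4", "CD8"}, all_of={"MARKER_X"})

print(two_marker_gate)
print(three_marker_gate)
```

Each extra marker narrows the selection, which is why isolating finer subsets pushes up the number of markers an instrument must resolve.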
Conventionally, researchers have relied
on a cell-profiling technology called flow
cytometry, in which coloured, fluorescent
proteins are attached to specific cell markers
so that combinations can be easily detected
and the cells scored or sorted. But overlaps in
colour spectra generally limit analyses to as few
as a dozen markers.

SOURCE: REF. 3

YVONNE YAMANAKA

The latest iteration of cell-profiling tech- 
nology — mass cytometry — uses rare-earth 
metals instead of fluorescence and can detect 
more than 40 markers. Because mass cytometry 
can identify so many cell types in a single sam- 
ple, more types of experiments can be done. 

Studies in babies, for example, are key to 
understanding the immune system's develop- 
ment. But infants generally cannot tolerate 
blood withdrawals of more than 4–5 millilitres
— and even simple flow-cytometry experiments 
can require more than 10 ml. Mass cytometry, 
by contrast, can run on less than 4 ml. 

Mark Davis, a molecular immunologist at 
Stanford University in California, used mass 
cytometry to track hundreds of parameters 
— including 72 different immune-cell popu- 
lations — in the blood of 210 twins. His team 
found⁵ that much of the variation between
people's immune systems can be attributed to 
environmental factors, rather than to genetic 
ones. Without mass cytometry, this work would 
have been too complex to perform, he says. 

DVS Sciences, now a part of Fluidigm, has 
invented a mass cytometer called the CyTOF
for use in cell profiling. The latest version (as 
well as upgrades for the older system) boosts 
sensitivity and sample-processing speed, and 
can run multiple samples at a time. 

But these technologies are expensive. The 
June version of the CyTOF — the ‘Helios’ 
system — starts at roughly US$500,000, not 
counting service contracts. At Stanford, Davis 
and other researchers rely on shared facilities. 


ASSEMBLING THE PIECES 

Although scientists are making progress, 
many tools have been slow to reach the clinic, 
says Padmanee Sharma, a physician–scientist
at the University of Texas MD Anderson 
Cancer Center in Houston. Every new clini- 
cal technology needs standards and quality 
assurances, which require extensive testing to 
establish. Clinical trials are only now adopting 
procedures that might help clinicians to track 
their patients’ immune responses and feed in 
to treatment decisions. 

Communication is another bottleneck. 
Information is accumulating rapidly and 
needs to be shared by collaborators as diverse 
as statisticians, clinicians, basic biologists 
and technologists. Coordinating research 
that involves human participants places huge 
demands on logistics, resources and expertise, 
and one major effort to facilitate such work is 
the Human Immunology Project Consortium 
(HIPC) funded by the US National Institutes 
of Health (NIH). The HIPC doles out grants 
to advance methods, and endeavours to extend 
the fruits of researchers’ labour to all.

The consortium offers an online data-analysis and management platform called ImmuneSpace, which helps researchers to place data in a long-term archive called the Immunology Database and Analysis Portal (ImmPORT), also funded by the NIH. The HIPC is spearheading efforts to standardize procedures for commonly performed assays in cytometry as well as alternate methods of immune profiling, such as measuring antibodies in serum samples.

Another emerging need is for techniques for easy cross-analysis of many data types, says Steve Kleinstein, a computational immunologist at Yale University in New Haven, Connecticut. “There’s a lot of subtlety in the data, and it’s very easy to pick up a piece of code or tool that somebody put out there on the web, run it with your data and get a plot that looks interesting, but that’s a very dangerous thing to do,” he says.

To help solve this problem, Kleinstein and his group have developed software called the Repertoire Sequencing Toolkit (pRESTO)⁶, which offers a way to process, annotate and correct raw sequencing data from high-throughput platforms such as Illumina. It also allows researchers to run their data in different computing environments and then return to the pRESTO environment.

A separate tool, a web portal known as the VDJServer, is in beta-testing after launching in April. It offers the ability to analyse B- and T-cell-receptor data, with the goal of providing an intuitive interface for users who have not done any programming, says project leader Lindsay Cowell, a bioinformatician and immunologist at the University of Texas Southwestern Medical Center in Dallas. The server will incorporate more analysis tools into the portal as they become available (Kleinstein’s pRESTO is already embedded). Moreover, the portal lets researchers share data and even tap into the computing power of the Texas Advanced Computing Center at the University of Texas at Austin.


Nanoarenas for cell attacks

In addition to tracking populations of immune cells, researchers want to know how they interact.

Christopher Love, an immuno-engineer at the Massachusetts Institute of Technology in Cambridge, is using microfluidics to probe how individual immune cells cooperate with each other. His lab engineers devices that he describes as “essentially ice-cube trays”: each well in the tray holds sub-nanolitre volumes, as opposed to the tens of microlitres held by wells in more-conventional plates.

Using these tiny arenas to watch natural killer (NK) cells home in on leukaemia cells, the team has discovered⁷, unexpectedly, that even a single NK cell will attack a cell that does not belong. In the past, researchers suspected that NK cells coordinated their actions through secreted chemical signals, but now it seems that such cooperation may be necessary only among larger cell groups.

Love and his team hope to map their understanding of interactions at this single-cell level to the immune system as a whole, and potentially compare healthy individuals with those who have cancer. “With these technologies, first you ask: can we define normal?” he says. “Then you can think about heterogeneity in disease.” M.F.

There is still an acute need for human 
immunology-specific data repositories, nota- 
bly for T- and B-cell-receptor sequencing 
data, says Jamie Scott, a molecular immunol- 
ogist at Simon Fraser University in Burnaby, 
Canada, who is co-leading an effort to share 
such data. 

But perhaps the biggest block is a basic one: 
a dearth of training. Most analysis requires 
some programming skills, says John Tsang, 
head of computational systems biology for the 
Trans-NIH Center for Human Immunology in 
Bethesda, Maryland. For now, most tools are 
limited to the specialist, he says; collaboration 
with those who can understand the program- 
ming is still the best way forward. 

Creating more collaborations should, in 
turn, help to ensure that the tools truly further 
basic knowledge and translate into practical 
applications. “It is very attractive to apply the 
latest gee-whiz ‘omics technology to measure 
things,” says Pulendran. “But I think we need 
to go beyond measuring and accumulation of 
data, to knowledge and to understanding.”


Marissa Fessenden is a science journalist and 
illustrator based in Bozeman, Montana. 


1. Querec, T. D. et al. Nature Immunol. 10, 116-125 (2009).
2. Li, S. et al. Nature Immunol. 15, 195-204 (2014).
3. DeKosky, B. J. et al. Nature Med. 21, 86-91 (2015).
4. Busse, C. E., Czogiel, I., Braun, P., Arndt, P. F. & Wardemann, H. Eur. J. Immunol. 44, 597-603 (2014).
5. Brodin, P. et al. Cell 160, 37-47 (2015).
6. Vander Heiden, J. A. et al. Bioinformatics 30, 1930-1932 (2014).
7. Yamanaka, Y. J. et al. Integr. Biol. 4, 1175-1184 (2012).




ADAPTED FROM HAPPY TOGETHER/SHUTTERSTOCK 


CAREERS 


GENE EDITING One investigator’s experience 
launching a lab with CRISPR p.415 


NETWORKING Tips for introverts engaging in 
a professional setting go.nature.com/Soghtf 


NATUREJOBS For the latest career 
listings and advice www.naturejobs.com 


WORK ENVIRONMENT 


When labs go bad 


A toxic relationship between junior scientist and adviser 
can quickly turn career prospects sour. 


BY CHRIS WOOLSTON 


There is no crying in baseball, according
to a famous quote from the 1992 film A
League of Their Own. But there is most
certainly crying in science, says Isaiah Hankel, 
a former cell biologist turned author and career 
coach. He admits shedding a couple of tears in 
a bathroom cubicle after his graduate adviser 
screamed at him in front of the entire lab — 
all while another principal investigator (PI) 
looked on. “It was the craziest thing,” he says. 


But it did not come entirely out of the blue. 
During his fifth year of study, Hankel had been 
promised an industry job — under the condi- 
tion that he get his PhD first. Unfortunately, his 
PI was not on board with the plan. “He totally 
withdrew his support,” he says. “I wanted to
map out exactly what I needed to do for graduation,
but he would never nail it down.”

Like Hankel, many junior researchers come
to realize that their relationship with their PI,
the one person who is most in control of their
careers, is not working out. “I’ve seen a lot of
situations where people are having problems
with their supervisors,” says Sarah Blackford, 
head of education and public affairs for the 
Society of Experimental Biology, headquar- 
tered in London. “People get very emotional, 
and things can escalate.” Blackford, who is 
based at Lancaster University, UK, and advises 
junior researchers throughout Europe, says 
that postdocs and graduate students in broken 
labs must work out the crucial next step. Are 
they going to endure a bad situation? Are they 
going to find a way to mend the relationship? 
Or are they going to jump ship? 

Whatever the decision — endure, repair or 
escape — the conflict will probably become a 
career turning point. Junior researchers who 
run afoul of their PIs may feel stuck, and they 
could end up with one fewer letter of recom- 
mendation than they had originally counted 
on, but that does not mean that their science 
days are over. With a positive attitude, a knowl- 
edge of institutional policies and some objec- 
tive, well-placed allies, it is possible to move 
on — to academia or beyond. 


A CHANGE IN TACK 
Hankel quickly realized that long hours and 
dedication were not going to be enough to 
break the impasse with his PI. “Working extra 
hard is exactly what my PI wanted,” he says. 
“I was getting more data for him. But if they 
aren't going to give you a target to hit, you can't 
keep spinning your wheels.” Instead of working 
harder, Hankel used some of his paid time off, 
giving himself time to make a plan. He started 
attending conferences, which he paid for out of 
pocket. That sort of networking, he says, can 
be especially important in times of conflict. He 
kept daily records of his interactions with his 
PI, and he saved all of the relevant e-mails. 

Most importantly, he set up meetings with 
his department head and several deans, and 
discussed with them his need for a clear path 
to graduation. He also consulted the school’s 
official graduate-school manual, which gave 
him a major source of leverage. Among 
other pronouncements noted in the manual, 
students were expected to graduate within five 
years, and advisers were supposed to actively 
support their students’ progress. Prompted by 
the meetings, his adviser finally told him the 
exact steps that he needed to take to finish his 
dissertation. With an exit plan in place, Hankel 
was able to get his degree about a year after all 
of the trouble started. 

Conflicts with senior scientists can be espe- 
cially bewildering for PhD students, says > 


17 SEPTEMBER 2015 | VOL 525 | NATURE | 413 


© 2015 Macmillan Publishers Limited. All rights reserved 


THE BEST DEFENCE 


Look before you leap 


Many junior researchers who find 
themselves at odds with their advisers could 
have avoided trouble with a little preliminary 
research. For PhD students, it is helpful to 
find someone who has a history of turning 
trainees into scientists, says career adviser 
Karen Kelsky in Eugene, Oregon. 

“Be a good detective. Check with other 
graduate students and postdocs, and look 
at the track record,” she says. The statistics 
will tell the story — many ‘bad’ advisers 
have never guided a doctoral student 
through the point at which he or she actually 
earned a degree. Of course, some principal 
investigators are too new to have much 
history. In those cases, Kelsky says, students 
should check with prospective advisers 
to make sure that they are committed to 
helping students to earn their degrees. 

Postdocs too often take a scattershot 
approach to finding a lab, says Sofie 
Kleppner, an assistant dean in the office of 


> Karen Kelsky, a science job coach in Eugene, 
Oregon, and author of The Professor is In: The 
Essential Guide to Turning your PhD into a Job 
(Three Rivers Press, 2015). PhD students do 
not always have the interpersonal experience to 
handle rocky relationships, she says, and they 
are often unprepared for the rigid hierarchy of 
academia (see ‘Look before you leap’). “There 
are some aggressive advisers who like the power 
and just want to see a person get destroyed,” she 
says. Instead of letting a student defend a thesis 
and receive a PhD, they ask for one more rewrite 
or one more experiment, not because the work 
is crucial, but to remind the student who is 
really in command, she says. “That’s probably 
the most common story I hear,’ she says. 

In many cases, students can get their freedom 
by putting their head down and meeting every 
request, even if it seems wrong or unhelpful. 
“That’s what ended up happening to me,’ says 
Kelsky, who has a PhD in cultural anthropology. 
“Trevised my dissertation by taking out every- 
thing my adviser hated and putting in every- 
thing she liked.” As they approach the finish 
line, she says, students should think less about 
their literary legacy and more about making 
their PI happy. “A lot of graduate students are 
obsessed with their dissertations, but the fact 
is that nobody is going to read them. They 
shouldn’t get so worked up.”


SEARCH FOR ALLIES 

One cognitive scientist, who asked not to be 
named, received her PhD from a prestigious 
university on the US West Coast. She says 
that her relationship with her PI fell apart in 
the fourth year of a five-year programme, a 


postdoctoral affairs at Stanford University 
in California. “Some of them will spam the 
entire university looking for a position,” she 
says. “They spam me, and it’s been a long 
time since I’ve had a lab.” 

Instead, they should conduct a much 
more focused search for a lab that is 
compatible with their personality, rather 
than just their scientific interests. She 
recommends that postdocs give a talk to 
the principal investigator and members of 
a prospective lab, creating an important 
opportunity for both sides to look for a 
good fit. In addition, they should set up an 
in-person chat with the adviser — and have 
lunch or dinner with other people in the lab. 
This is the chance to ask a question that 
could prevent a lot of future trouble: what is 
the worst thing about working in this lab? 

If the complaints run far beyond the 
normal scientific grumblings, it is better to 
keep looking. C.W. 


particularly vulnerable time in her education. 
A combination of misdeeds, misunderstand- 
ings and hurt feelings left her wondering 
whether she should abandon the programme 
and start again. Among other questionable 
behaviours, her adviser seethed when she 
did some work with a rival lab during her 
adviser’s sabbatical. When her adviser gave 
one of her projects to another student, she felt 
the relationship was irretrievably damaged. 
But instead of quitting her PhD programme, 
she had coffee with a faculty member who 
helped her to look at the big picture. “She said 


I shouldn’t throw away four years of work.” The
same faculty member stepped up to become 
the co-chair of the student’s committee, a 
position from which she could ensure that the 
degree process would be fair and unbiased. 
“She made sure I wasn't retaliated against,” says 
the cognitive scientist, who is now a tenure- 
track assistant professor at a US university. 

For postdoctoral researchers, conflicts with 
PIs can cause a lot of soul-searching and career 
angst, says Sofie Kleppner, an assistant dean 
in the postdoctoral-affairs office at Stanford 
University in California. “It’s a huge issue if
you’re in a lab and you feel like it’s the wrong
lab for you,” she says. In her experience, postdocs
often feel as if they and their advisers are
not on the same page. “One of the biggest problems
is mismatched expectations,” she says. “A
postdoc might want to be independent, but a
PI might be the type who likes to check in. That
can cause a lot of frustration.”

In some cases, simple misunderstandings 
can cause a lot of tension. “A postdoc might tell 
me that they don’t want to go into academia, 
but they’re afraid to tell their PI,” Kleppner
says. “And then the PI will say that he’s wor- 
ried because the postdoc doesn’t seem cut out 
for academia.” The upside of simple misunder- 
standings, she says, is that they often have an 
equally simple solution: talking about it. 


THE ART OF CONVERSATION 

As professional scientists, postdocs need to 
take a business-like approach to conflicts 
with their PIs, Blackford says. That means 
communication — and a lot of it. “You have 
to talk about the situation without getting 
personal,” she says. “Set up a meeting with a 
proper agenda.” Blackford adds that not all PIs 
are especially approachable or easy to talk to. 
If one-on-one conversations do not completely 
solve the problem, she recommends finding an 




Science-career coach Karen Kelsky helps PhD students to navigate the job world. 




KAREN KELSKY 


JOS SCHMID/UNIV. OF ZURICH 


impartial faculty member who is willing to 
offer confidential advice. 

In some cases, Blackford says, discussing 
the situation with an objective ally can help 
disgruntled junior researchers to understand 
the true source of their discontent. “Some 
people can't even put a finger on what's 
gone wrong,” she says. “They just don’t feel
respected, and then they have a crisis of confidence.
It’s helpful to talk with someone who
can tease out what you’re saying.”

Postdocs should develop on-campus 
allies who can serve as sounding boards and 
counselors. “I tell people to identify their 
peer support and mentors early,” Kleppner
says. “You need someone who can advocate 
for you if something isn’t working out.” 
Adding another person to the conversa- 
tion can be a quick way to find compromise 
and clarity, she says. “It’s basic ‘Conflict
Resolution 101’.”

Not all conflicts can be resolved — some 
postdocs eventually decide to leave a lab for 
good. “These are high-powered people who 
don’t want to admit failure,” Kleppner says. 
“But it’s OK to admit it.” When it is time to
leave, professionalism is more important 
than ever. She recommends explaining the 
decision to a PI in clear, dispassionate terms
— the same tone that is needed when talk- 
ing to other PIs about a possible job. Natu- 
rally, they will want to know why the last job 
did not work out, but they don’t want to be 
dragged into the drama. A postdoc who can 
clearly communicate why the last lab was 
not an ideal fit — without making any per- 
sonal attacks on his or her former PI — will 
have a good chance of moving on. “You're 
not going to ruin your reputation as long as 
you don't ruin anyone else’s,” says Kleppner. 

Hankel managed to leave academia with 
his reputation — and his degree — intact. 
As a career consultant, he now encourages 
other scientists to stand up for themselves 
even when the hierarchy is tipped against 
them. He notes that some scientists end up 
spending so many years doing their PhD and 
multiple postdocs that they barely have time 
to establish their careers before retirement. 
“Advisers hold the keys to people’ lives,” he 
says — which means that it is important to 
resolve disputes as quickly as possible and 
avoid spending too much time in a lab that
will not promote a junior researcher’s progress.
When a PI is not being supportive,
Hankel says, early-career researchers have to 
prioritize their professional interests — even 
if that means hurt feelings, bruised egos and 
a change of venue. “It’s always appropriate to
have self-respect,” he says.


Chris Woolston is a freelance writer in 
Billings, Montana. 


A forced lab move can be a hassle. Find out how 
to handle it seamlessly in an upcoming issue of 
Nature Careers. 


TURNING POINT 
Martin Jinek


Structural biologist Martin Jinek helped to 
launch the genome-modification craze that
is upending biological research. Now running
his own laboratory at the University of Zurich 
in Switzerland, Jinek describes how research is 
changing as CRISPR — a gene-editing tool with 
the potential to cheaply alter plants, animals 
and even human embryos — takes hold. 


Did you set out to work on CRISPR after 
completing graduate school? 

No. When I started as a postdoc in Jennifer 
Doudna’s group at the University of California, 
Berkeley, in 2007, we knew practically nothing 
about CRISPR, which stands for ‘clustered regularly
interspaced short palindromic repeats’. The
first paper describing it as an adaptive immune 
system in bacteria came out early that year 
(R. Barrangou et al. Science 315, 1709-1712; 
2007). Although Doudna was one of the first 
to explore CRISPR, my original project was on 
the molecular mechanisms of microRNA. But 
the CRISPR field became more interesting, so 
I collaborated with some group members and 
finally began my own project working on Cas9, 
an enzyme that cuts DNA. 


When did it become clear that CRISPR was a 
game changer? 

We were interested at first because it looked 
similar to RNA interference, in which RNA 
molecules inhibit the expression of genes. 
But the molecular machinery was intriguingly 
different. The wider implications — and its 
potential utility in genome research — came 
only after we learned that it cuts double- 
stranded DNA and is programmable, which 
made it even more interesting to work on. 


What is most surprising about this technology? 
How quickly it has developed. Within six 
months of publishing a paper showing that 
CRISPR can be programmed (M. Jinek et al. Sci- 
ence 337, 816-821; 2012), three labs — includ- 
ing ours — were using it as a genome-editing 
tool. Within 12 months, researchers were apply- 
ing it to many cell types and organisms. 


How is CRISPR shaping your research agenda? 
My goal is to understand how the system 
actually works. My resources are not unlim- 
ited, so I focus on what I do well — structural 
biology. Five of the ten people in my lab, which 
began in 2013, are aiming to gain a better 
structural understanding of the DNA-cutting 
mechanisms in CRISPR systems so that we can 
engineer the system to be more efficient and 
versatile. The CRISPR technology is finding 


applications in basic-research labs, as well as in 
biotechnology and molecular-medicine labs, 
to potentially cure genetic disease or engineer 
organisms to make biofuels. I’m already using 
it to address other research questions. 


What did you take from your experience as a 
graduate student in a new lab? 

I was the third PhD student in Elena Conti’s first 
laboratory, at the European Molecular Biology 
Laboratory in Heidelberg, Germany. She was 
a fantastic mentor, and being in her lab at an 
early phase of her career has shaped my own 
lab. She was a tough boss, but she taught me 
how to approach a scientific problem to find the
right questions, and how to do good science to 
answer those questions. 


Has the public reaction to CRISPR had an 
impact on your work? 

On some level, we anticipated it would be big. 
We just didn’t know how big. The wider societal 
and potential ethical issues associated with the 
use of CRISPR, especially those that relate to 
human-genome modification, have generated 
a lot of attention. The negative side of working 
in the CRISPR field is that it is so competitive, it 
leaves little time for anything else.


INTERVIEW BY VIRGINIA GEWIN 


This interview has been edited for length and clarity. 


CORRECTION 

The Careers feature ‘Mind Wide Open’ 
(Nature 525, 147-148; 2015) stated
that BEST had offered career training to
about 10,000 graduate students and 600 
postdocs since its launch. In fact, at least 
4,000 postdocs have benefited. 




SCIENCE FICTION


WADING INTO WATER 


BY TODD HONEYCUTT 


We all come from the sea.
That’s what the science
books tell us, and what I
think of as I walk the beaches near my
home, listening to the ocean waves 
sing their songs against the earth; 
watching the gulls struggle with 
each other for scraps of food; 
collecting odd debris and shells 
pushed up by the waves. 

What I think of as I remem- 
ber my daughter. 

Our last conversation — our 
last face-to-face conversation — 
occurred almost two years ago. 
Pearl had called, said she wanted 
to see me, had something impor- 
tant to say. I asked her to come to 
my shore house, because the city per- 
plexes me more than it did when I was 
younger, and the tone of her voice made 
me think that it would be better on my turf 
than hers. 

We weren't estranged, Pearl and I. After 
her mother’s and my divorce, and her flow- 
ering into her own as an adult, we had just, 
as they say, grown apart. A natural progres- 
sion of our lives, I guess, as children stand on 
their own, find their own paths. But I wished 
things had been different. 

Still do. 

Pearl came the following weekend, a 
Saturday morning in September. We bought 
coffees and walked to the boardwalk, found 
a bench facing the ocean. People and 
umbrellas dotted the sand between us and 
the sea. No cloud in that tranquil sky tem- 
pered the brightness of the Sun. 

We talked of pleasantries, memories of 
beach vacations past. Then Pearl cut to the 
core of the issue. 

“I’m going to be uploaded.”

“Uploaded?” I said. 

I had heard of it, of course, but hadn’t paid
any attention. Didn't concern me. Not much 
in the news did. One of the advantages of 
growing old — nothing seems newsworthy 
any more. 

Almost nothing. 

“A friend of a friend has me on a list. It’s 
beyond the experimental stage now. It’s safe.” 

“But you lose...” 

“I lose this body, and I gain so much more.
It’s the new frontier, Dad.”

As if it were the Wild West or the Moon 
base. 


On the shores of memory. 


What can a father say to the choices his 
children make? How many times had my 
pleas had no effect? Or worse, cornered her 
to become more firmly entrenched? 

When my words had ebbed, Pearl filled the 
space with information about the procedure. I 
couldn’t hear her words, or what I could hear,
I couldn't make sense of. I know more now. 
How the brain is put in a small vat, bathed in 
salts and chemicals and solutions. Wired so 
that its consciousness is free to roam worlds 
both virtual and real. Entire civilizations rise 
and fall, fantastic landscapes more strange 
than any I can imagine have people living, 
working, achieving, actualizing. Perhaps it is 
the new frontier, but it’s hard for me to under- 
stand a world without the taste of food and 
drink, the feel of sand and water. 

Yes, I know more about it now. 

“Honey, it seems so permanent, so...” 

“I’ll still be around.” Pearl touched my arm.
“I’ll still be me. Just... in a different form.”

I finished my coffee, held the cup against 
my leg to keep the wind from tossing it 
along the boardwalk. I looked at Pearl, at the 
woman she had become. Strong, independent.
Her stubbornness had matured
into assuredness and confidence. She
had parlayed her
overwhelming curiosity and political astute- 
ness into a career transforming boutique 
biotech companies into global players. 
I was proud of her. I just wished I could 
have shared her excitement. 
“When?” 
“Not sure, but soon.” 

“What about your mother?” 

“I haven’t told her.”

“She’ll be heartbroken.” Easier
to say than that I was heartbroken.

“Some day,” she said, “maybe
you both will join me.”
Some day, her mother would. 

“Let’s go swimming,” I said.

“Swimming?” 

I pointed to the ocean. 

“I didn’t bring a suit.”

“I know a store down the road.”
I pointed behind us. 
“Dad, really.”
“Please.” 
She acquiesced. 

We waded into the water, my daughter 
and me, for what would be the last time. 
It wasn't Pearl at age two, screaming at the 
monstrosity of the ocean, its vastness. It 
wasn't the joy of Pearl at nine, excited at 
each wave, her whole body giggling as she 
fought and swam and dived. It wasn’t the 
angst of Pearl at 15, the constant churn- 
ing of, and chattering with, friends. It was 
Pearl as an adult. The water took us and 
allowed us to be together in a way we hadn't 
in years. We floated and body surfed, smiles 
on our faces as the waves crashed over us 
and threw us into the sand, the crunch of 
shells under our feet, the briny taste of the 
water filling our mouths, and in between, 
our small talk of city life and beach life and 
our shared memories of family and each 
other. Throughout it all, the sounds of the 
ocean rising and falling, the weight of my 
heart, rising and falling. 

The papers are signed, my name on a list.
I focus on the tastes and touches and sounds 
I encounter, as if they’ll be the last I’ll know.
And each day, I sit on the boardwalk and 
wonder, as the waves rise and fall, whether 
I'll feel the same as our last day in the ocean 
when I next see my daughter.


Todd Honeycutt is a public-health 
researcher in New Jersey who enjoys 
thinking about alternatives (and 
alternatives to alternatives). His stories have 
appeared in Fiction Vortex. 


ILLUSTRATION BY JACEY 


